PLANT GENES FOR SENSITIVITY TO ETHYLENE AND PATHOGENS

REFERENCE TO RELATED APPLICATIONS 

This application is a continuation-in-part of 
U.S. application Serial No. 08/003,311, filed January 12, 
1993, a continuation-in-part of U.S. application Serial No. 
928,464, filed August 10, 1992; this application is also a 
continuation-in-part of U.S. application Serial No. 
08/171,207, filed December 21, 1993, which is a 
continuation of U.S. application Serial No. 899,262, filed 
June 16, 1992, now abandoned; the disclosures of which are 
hereby incorporated in their entirety. 

REFERENCE TO GOVERNMENT GRANTS 

This work was supported in part by research
grants from the National Institutes of Health GM-26379
and National Science Foundation grant IBN-92-05342. The
United States Government may have certain rights in this
invention.

BACKGROUND OF THE INVENTION 

Ethylene, a gaseous plant hormone, is involved in
the regulation of a number of plant processes ranging from
growth and development to fruit ripening. As in animal
systems, the response of plants to disease involves not only
static processes but also inducible defense
mechanisms. One of the earliest detectable events to occur
during plant-pathogen interaction is a rapid increase in
ethylene biosynthesis. Ethylene biosynthesis, in response
to pathogen invasion, correlates with increased defense
mechanisms, chlorosis, senescence and abscission. The
molecular mechanisms underlying operation of ethylene
action, however, are unknown. Nonetheless, ethylene
produced in response to biological stress is known to
regulate the rate of transcription of specific plant genes.
A variety of biological stresses can induce ethylene
production in plants, including wounding and bacterial, viral
or fungal infection, as can treatment with elicitors, such
as glycopeptide elicitor preparations (prepared by chemical
extraction from fungal pathogen cells). Researchers have
found, for example, that treatment of plants with ethylene
generally increases the level of many pathogen-inducible
"defense proteins", including β-1,3-glucanase, chitinase,
L-phenylalanine ammonia lyase, and hydroxyproline-rich
glycoproteins. The genes for these proteins can be
transcriptionally activated by ethylene and their
expression can be blocked by inhibitors of ethylene
biosynthesis. Researchers have also characterized a normal
plant response to the production or administration of
ethylene as a so-called "triple response". The triple
response involves inhibition of root and stem elongation,
radial swelling of the stem and absence of normal geotropic
response (diageotropism).

Ethylene is one of five well-established plant
hormones. It mediates a diverse array of plant responses
including fruit ripening, leaf abscission and flower
senescence.

The pathway for ethylene biosynthesis has been
established (Figure 6). Methionine is converted to
ethylene with S-adenosylmethionine (SAM) and
1-aminocyclopropane-1-carboxylic acid (ACC) as
intermediates. The production of ACC from SAM is catalyzed
by the enzyme ACC synthase. Physiological analysis has
suggested that this is the key regulatory step in the
pathway, see Kende, Plant Physiol. 1989, 51, 1-4. This
enzyme has been cloned from several sources, see Sato et
al., PNAS, (USA) 1989, 86, 6621; Van Der Straeten et al.,
PNAS, (USA) 1990, 87, 4859-4863; Nakajima et al., Plant
Cell Physiol. 1990, 29, 989. The conversion of ACC to
ethylene is catalyzed by ethylene forming enzyme (EFE),
which has been recently cloned (Spanu et al., EMBO J 1991,
10, 2007). Aminoethoxy-vinylglycine (AVG) and
α-aminoisobutyric acid (AIB) have been shown to inhibit ACC
synthase and EFE, respectively. Ethylene binding is
inhibited non-competitively by silver, and competitively by
several compounds, the most effective of which is
trans-cyclooctane. ACC synthase is encoded by a highly
divergent gene family in tomato and Arabidopsis (Theologis,
A., Cell 70:181 (1992)). ACC oxidase, which converts ACC
to ethylene, is expressed constitutively in most tissues
(Yang et al., Ann. Rev. Plant Physiol. 1984, 35, 155), but
is induced during fruit ripening (Gray et al. Cell 1993 72,
427). It has been shown to be a dioxygenase belonging to
the Fe2+/ascorbate oxidase superfamily (McGarvey et al.,
Plant Physiol. 1992, 98, 554).

Etiolated dicotyledonous seedlings are normally
highly elongated and display an apical arch-shaped
structure at the terminal part of the shoot axis, the
apical hook. The effect of ethylene on dark grown
seedlings, the triple response, was first described in peas
by Neljubow in 1901, Neljubow, D., Pflanzen Beih. Bot.
Zentralb., 1901, 10, 128. In Arabidopsis, a typical triple
response consists of a shortening and radial swelling of
the hypocotyl, an inhibition of root elongation and an
exaggeration of the curvature of the apical hook (Figures 7
and 16). Etiolated morphology is dramatically altered by
stress conditions which induce ethylene production. The
ethylene-induced "triple response" may provide the seedling
with additional strength required for penetration of
compact soils, see Harpham et al., Annals of Bot., 1991,
68, 55. Ethylene may also be important for other stress
responses. ACC synthase gene expression and ethylene
production are induced by many types of biological and
physical stress, such as wounding and pathogen infection,
see Boller, T., in The Plant Hormone Ethylene, A.R. Mattoo
and J.C. Suttle eds., 293-314, 1991, CRC Press, Inc. Boca
Raton; Yu, Y. et al., Plant Phys., 1979, 63, 589; Abeles
et al. 1992 Second Edition San Diego, CA Academic Press;
and Gray et al. Plant Mol Biol. 1992 29, 69.

A number of researchers have identified the
interaction between Arabidopsis thaliana and Pseudomonas
syringae bacteria; Whalen et al., "Identification of
Pseudomonas syringae Pathogens of Arabidopsis and a
Bacterial Locus Determining Avirulence on Both Arabidopsis
and Soybean", The Plant Cell 1991, 3, 49, Dong et al.,
"Induction of Arabidopsis Defense Genes by Virulent and
Avirulent Pseudomonas syringae Strains and by a Cloned
Avirulence Gene", The Plant Cell 1991, 3, 61, and Debener
et al., "Identification and Molecular Mapping of a Single
Arabidopsis thaliana Locus Determining Resistance to a
Phytopathogenic Pseudomonas syringae Isolate", The Plant
Journal 1991, 1, 289. P. syringae pv. tomato (Pst) strains
are pathogenic on Arabidopsis. A single bacterial gene,
avrRpt2, was isolated that controls pathogen avirulence on
the specific Arabidopsis host genotype Col-0.

Bent, A.F., et al., "Disease Development in
Ethylene-Insensitive Arabidopsis thaliana Infected with
Virulent and Avirulent Pseudomonas and Xanthomonas
Pathogens", Molecular Plant-Microbe Interactions 1992, 5,
372; Agrios, G.N., Plant Pathology 1988, 126, Academic
Press, San Diego; and Mussel, H., "Tolerance to Disease",
page 40, in Plant Disease: An Advanced Treatise, Volume 5,
Horsfall, J.G. and Cowling, E.B., eds., 1980, Academic
Press, New York, establish the art recognized definitions
of tolerance, susceptibility, and resistance. Tolerance is
defined for purposes of the present invention as growth of
a pathogen in a plant where the plant does not sustain
damage. Resistance is defined as the inability of a
pathogen to grow in a plant, such that no damage to the
plant results. Susceptibility is indicated by pathogen growth
with plant damage.
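
For purposes of illustration only, the three definitions above can be
written as a small decision rule. The following Python sketch is not
part of the original disclosure; the names Interaction and
classify_interaction are hypothetical. The combination of plant damage
without pathogen growth is not addressed by the definitions and is
therefore returned as None.

```python
from enum import Enum
from typing import Optional


class Interaction(Enum):
    TOLERANCE = "tolerance"            # pathogen grows, plant not damaged
    RESISTANCE = "resistance"          # pathogen cannot grow, plant not damaged
    SUSCEPTIBILITY = "susceptibility"  # pathogen grows, plant damaged


def classify_interaction(pathogen_grows: bool, plant_damaged: bool) -> Optional[Interaction]:
    """Map the two observations used in the definitions to a category.

    Returns None for damage without pathogen growth, a case the
    definitions in the text do not cover.
    """
    if pathogen_grows and not plant_damaged:
        return Interaction.TOLERANCE
    if not pathogen_grows and not plant_damaged:
        return Interaction.RESISTANCE
    if pathogen_grows and plant_damaged:
        return Interaction.SUSCEPTIBILITY
    return None
```

The two boolean inputs are a simplification; in practice both pathogen
growth and plant damage are matters of degree.
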

Regardless of the molecular mechanisms involved,
the normal ethylene response of a plant to pathogen
invasion has been thought to have a cause and effect
relationship in the ability of a plant to fight off plant
pathogens. Plants insensitive in any fashion to ethylene
were believed to be incapable of eliciting a proper defense
response to pathogen invasion, and thus unable to initiate
proper defense mechanisms. As such, ethylene insensitive
plants were thought to be less disease tolerant.

The induction of disease responses in plants
requires recognition of pathogens or pathogen-induced
symptoms. In a large number of plant-pathogen
interactions, successful resistance is observed when the
plant has a resistance gene with functional specificity for
pathogens that carry a particular avirulence gene. If the
plant and pathogen carry resistance and avirulence genes
with matched specificity, disease spread is curtailed and a
hypersensitive response involving localized cell death and
physical isolation of the pathogen typically occurs. In
the absence of matched resistance and avirulence genes,
colonization and tissue damage proceed past the site of
initial infection and disease is observed.

A better understanding of plant pathogen
tolerance is needed. Also needed is the development of
methods for improving the tolerance of plants to pathogens,
as well as the development of easy and efficient methods
for identifying pathogen tolerant plants.

Genetic and molecular characterization of several
gene loci and protein products is set forth in the present
invention. The results will reveal interactions among
modulatory components of the ethylene action pathway and
provide insight into how plant hormones function. Thus,
the quantity, quality and longevity of food, such as fruits
and vegetables, and other plant products such as flowers,
will be improved thereby providing more products for market
in both developed and underdeveloped countries.

SUMMARY OF THE INVENTION

The present invention is directed to nucleic acid
sequences for ethylene insensitive (EIN) loci and
corresponding amino acid sequences. Several ein wild type
sequences, mutations, amino acid sequences, and protein
products are included within the scope of the present
invention. The nucleic acid sequences set forth in
SEQUENCE ID NUMBERS 1 and 2 for ein2; 4, 5, 7, 9, and 11
for ein3 and eil1, eil2, eil3; as well as amino acid
sequences set forth in SEQUENCE ID NUMBERS 3 for ein2; 6,
8, 10, 12, and 13 for ein3 and eil1, eil2, eil3; are
particular embodiments of the present invention.

The present invention is also directed to nucleic
acid sequences for hookless1 (HLS1) alleles and amino acid
sequences. Wild type and mutated nucleic acid sequences,
amino acid sequences and proteins are included within the
scope of the present invention. The nucleic acid
sequences of hls1 are set forth in SEQUENCE ID NUMBERS: 14
and 15; the amino acid sequence is set forth in SEQUENCE
ID NUMBER: 16.

These and other aspects of the invention will 
become more apparent from the following detailed 
description when taken in conjunction with the following 
figures. 

BRIEF DESCRIPTION OF THE FIGURES

Figure 1 displays the EIN2 region on chromosome 5
of Arabidopsis thaliana. O represents the left end probe and
□ represents the right end probe; a length of 100 kb is
represented in the legend.

Figure 2 is a genomic Southern blot. A
polymorphism was detected in ein2-12 by hybridization with
g3715. The g3715 cosmid was hybridized to a genomic
Southern blot containing several alleles of ein2. In ein2-12
EcoR I digested genomic DNA, two bands were missing, 1.2
kb and 4.3 kb; and a new 5.5 kb fragment was detected. The
DNA from the ein2 alleles was purified according to Chang
et al. Proc. Natl. Acad. Sci USA 1988 85, 6857. 5 μg of
EcoR I digested DNA was separated on a 0.8% agarose gel and
blotted to Hybond N+ (Sambrook et al., Molecular Cloning: A
Laboratory Manual, 2nd ed., 1989, Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, NY; Amersham,
Arlington Heights, IL). All hybridizations were done using
random hexamer labeled DNAs (Feinberg and Vogelstein,
Anal. Biochem 1984 137, 266). Filters were prehybridized
for at least 2 hours in 0.5 M sodium phosphate pH 7.2, 7%
sodium dodecyl sulfate, and 1% BSA at 60° C. Hybridization
of a minimum of 15 hours was in a solution of 0.5 M sodium
phosphate pH 7.2, 7% sodium dodecyl sulfate, and 1% BSA at
60° C. Hybridization filters were washed and
autoradiographed (Sambrook et al. 1989).

Figure 3 is a diagram of the polymorphism in
ein2-12 due to the loss of an EcoR I site. The pgEE1.2
subclone from g3715 is shown.
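
The band pattern described for Figure 2 follows from the loss of a
single internal EcoR I site: the two fragments that flanked the site
are replaced by one fragment of their combined length. The Python
sketch below illustrates this arithmetic. The cut-site coordinates are
hypothetical values chosen only so that the flanking fragments are
1.2 kb and 4.3 kb, and the function name fragment_lengths is likewise
an assumption. The final check uses the 24 nucleotides reported
elsewhere in this description as deleted in ein2-12, which contain the
EcoR I recognition sequence GAATTC. The 24 nucleotide loss itself is
ignored when computing the merged fragment.

```python
def fragment_lengths(cut_sites):
    """Return the lengths (bp) of fragments between consecutive cut sites."""
    sites = sorted(cut_sites)
    return [right - left for left, right in zip(sites, sites[1:])]


# Hypothetical EcoR I site coordinates (bp) on the region of interest.
wild_type_sites = [0, 1200, 5500]

# In the mutant, the internal site at 1200 is lost.
mutant_sites = [site for site in wild_type_sites if site != 1200]

wild_type_fragments = fragment_lengths(wild_type_sites)  # two bands
mutant_fragments = fragment_lengths(mutant_sites)        # one merged band

# Sequence stated in the text as deleted in ein2-12.
EIN2_12_DELETION = "TGCTACAATCAGAATTCTTGCAGT"
deletion_removes_ecori_site = "GAATTC" in EIN2_12_DELETION
```
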

Figure 4 is a description of the EIN2 locus; the
cDNA (bottom) is shown relative to the genomic map (top).
A putative TATA sequence is shown approximately 60 base
pairs 5' to the start of the cDNA. The positions of the
translation start and stop sites are also shown.

Figure 5 exhibits the sequence of the EIN2 locus.
Genomic DNA sequence (SEQUENCE ID NO: 1) is shown in lower
case letters, cDNA sequence (SEQUENCE ID NO: 2) is shown in
capital letters. The predicted peptide sequence (SEQUENCE
ID NO: 3) is displayed under the corresponding nucleic acid
codons.

Figure 6 is a schematic illustration of the
ethylene biosynthesis pathway.

Figure 7 depicts a seedling body and developing
plant. Specifically, Figure 7A is a cross section of the
seedling body of a seed plant. Figure 7B is a perspective
view of a developing seed plant.

Figure 8 identifies the protein sequences of
eil1, ein3, eil2, eil3, and a common consensus protein
sequence representing all four of the individual protein
sequences.

Figure 9 displays the EIN3 gene structure and
mutants. Also set forth in Figure 9 is the predicted
polypeptide acidity and basicity, as well as Asn repeats.

Figure 10 exhibits a map of chromosome 3 and the
position of EIN3 relative to other gene loci.

Figure 11 sets forth a map of chromosome 2 and
the position of EIL1 relative to other gene loci.

Figure 12 displays a map of chromosome 5 and the
position of EIL2 relative to other gene loci.

Figure 13 exhibits a map of chromosome 4 and the
position of HLS1 relative to other gene loci.

Figure 14 is a representation of the arrangement
of hls mutants on chromosome 4.

Figure 15 identifies the protein sequences of
Arabidopsis HLS1 and acetyl transferases in E. coli,
Pseudomonas, Streptomyces, Mouse, Human, Azospirillum,
Yeast, and Citrobacter. A consensus sequence representing
common amino acids of the sequences is also provided.

Figure 16 displays ethylene responses in wild
type and mutant (ctr1, eto1, hls1, etr1, ein2, ein3)
Arabidopsis seedlings. Seeds of the indicated genotype
were germinated and grown for three days in the dark in
either air or air containing 10 ppm ethylene.

Figure 17 is a genetic model of interactions
among components of the ethylene signal transduction
pathway. This model shows the predicted order in which the
various gene products act, which is based on the epistatic
relationships among the mutants. The seedling ethylene
responses are indicated on the right.

Figure 18 is a representation of pNIiEIN3Bgl2
indicating the relationship between the promoter, GUS, and
EIN3 sequences.

Figure 19 displays EIN3 sequences. Figure 19A
sets forth EIN3 cDNA (SEQUENCE ID NO: 4), Figure 19B sets
forth EIN3 genomic DNA (SEQUENCE ID NO: 5), and Figure 19C
sets forth EIN3 protein sequence (SEQUENCE ID NO: 6).

Figure 20 displays EIL1 sequences. Figure 20A
sets forth EIL1 cDNA (SEQUENCE ID NO: 7), Figure 20B sets
forth EIL1 peptide sequence (SEQUENCE ID NO: 8).

Figure 21 displays EIL2 sequences. Figure 21A
sets forth EIL2 cDNA (SEQUENCE ID NO: 9), Figure 21B sets
forth EIL2 peptide sequence (SEQUENCE ID NO: 10).

Figure 22 displays EIL3 sequences. Figure 22A
sets forth EIL3 cDNA (SEQUENCE ID NO: 11). EIL3 peptide
sequence is set forth in SEQUENCE ID NO: 12.

Figure 23 displays HLS1 sequences. Figure 23A
sets forth HLS1 cDNA (SEQUENCE ID NO: 14), Figure 23B sets
forth HLS1 genomic DNA sequence (SEQUENCE ID NO: 15), and
Figure 23C sets forth HLS1 peptide sequence.

DETAILED DESCRIPTION OF THE INVENTION 

The present invention is directed to nucleic acid
and amino acid sequences which lend valuable
characteristics to plants.

The present invention is directed to nucleic acid
sequences of the EIN2 locus. Wild type and mutant
sequences of EIN2 are within the scope of the present
invention. Amino acid and protein sequences corresponding
to the nucleic acid sequences are included in the present
invention. EIN2 mutations provide for ethylene
insensitivity and pathogen tolerance in plants.

SEQUENCE ID NO: 2, the isolated cDNA representing
the nucleic acid sequence coding for EIN2, and the isolated
genomic EIN2 sequence of SEQUENCE ID NO: 1 are embodiments
of the present invention. The purified amino acid sequence
of SEQUENCE ID NO: 3 represents the EIN2 protein product
encoded by the cDNA identified above. The EIN2 mutations
identified herein by nucleotide position are numbered
from the beginning of the cDNA.

An ein2-3 mutation was created by X-ray
mutagenesis which resulted in a thymidine insertion at
nucleotide position 3642 of the cDNA sequence in SEQUENCE
ID NO: 2. A frameshift results in the corresponding amino
acid sequence.

An ein2-4 mutation was also generated by X-ray
mutagenesis. The ein2-4 mutation has an "AG" to "TTT"
mutation at position 2103 of the EIN2 cDNA sequence
resulting in a frameshift in the corresponding amino acid
sequence.

An ein2-5 mutation was generated by X-ray
mutagenesis, such that a deletion beginning at nucleic acid
position 1570 of the cDNA occurred. Nucleic acids CATGACT
were deleted. A frameshift results in the corresponding
protein product.

An ein2-6 mutation has a deletion of nucleic
acids GAGTTGCGCATG, SEQ ID NO: 17, beginning at nucleic
acid position 965 of the cDNA sequence. The ein2-6
mutation was generated by Agrobacterium mutagenesis. This
mutation results in a deletion at the amino acid level of
Gly-Val-Ala-His, SEQ ID NO: 18, formerly beginning at amino
acid position 115.

Another mutation, ein2-9, was generated by DEB
mutagenesis and has an "A" to "C" change at position
4048 that results in a "His" to "Pro" change at amino acid
position 1143 in the corresponding protein.

ein2-11 was generated by DEB mutagenesis and has
a "TG" to "AT" change at nucleic acid position 3492.
This results in an Ochre stop signal at amino acid position
957 in the protein.

An ein2-12 mutation was obtained by X-ray
mutagenesis resulting in a deletion at nucleic acid
position 1611 of nucleic acids TGCTACAATCAGAATTCTTGCAGT,
SEQ ID NO: 19. The corresponding amino acid sequence
reveals a deletion of amino acids Ala-Thr-Ile-Arg-Ile-Leu-
Ala-Val, SEQ ID NO: 20, beginning at amino acid position
331.

An ein2-16 mutation results in an "AGT" to "G"
change at nucleic acid position 2851 as a result of
X-ray mutagenesis. A frameshift results in the corresponding
protein.

Table 4 sets forth the EIN2 alleles and the
results of the mutagenesis.
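
Whether each of the EIN2 lesions described above shifts the reading
frame can be checked from the net change in sequence length: a net
change that is not a multiple of three shifts the frame, while a
multiple of three removes or adds whole codons. The Python sketch below
applies this rule to the deleted and inserted nucleotides stated in the
text. The function and variable names are hypothetical and the sketch
is offered only as an illustration.

```python
def shifts_reading_frame(deleted: str = "", inserted: str = "") -> bool:
    """True when the net length change is not a multiple of three."""
    return (len(inserted) - len(deleted)) % 3 != 0


# (deleted nucleotides, inserted nucleotides) as stated in the text.
ein2_lesions = {
    "ein2-3": ("", "T"),            # thymidine insertion
    "ein2-4": ("AG", "TTT"),        # "AG" replaced by "TTT"
    "ein2-5": ("CATGACT", ""),      # 7 nucleotide deletion
    "ein2-6": ("GAGTTGCGCATG", ""),  # 12 nucleotide deletion
    "ein2-12": ("TGCTACAATCAGAATTCTTGCAGT", ""),  # 24 nucleotide deletion
    "ein2-16": ("AGT", "G"),        # "AGT" replaced by "G"
}

frame_effects = {
    allele: shifts_reading_frame(deleted, inserted)
    for allele, (deleted, inserted) in ein2_lesions.items()
}

# For in-frame deletions, the number of residues lost is the length / 3.
residues_lost = {
    allele: len(deleted) // 3
    for allele, (deleted, inserted) in ein2_lesions.items()
    if not frame_effects[allele]
}
```

The in-frame results agree with the text: the 12 nucleotide deletion in
ein2-6 removes four residues and the 24 nucleotide deletion in ein2-12
removes eight.
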

Ein3 sequences for genes and proteins are the
subject of the present invention. The present invention is
directed to wild type nucleic acid and amino acid sequences
as well as mutations of these sequences. EIN3 mutations
result in ethylene insensitive plants. Ein-like genes and
protein sequences, including eil1, eil2, and eil3
sequences, are similar to ein3 sequences, and are also
disclosed in the present invention. The EIN3 mutations are
identified below by nucleotide position number in
accordance with the beginning of the genomic DNA sequence.

The DNA sequences coding for ein3 are set forth
in SEQ ID NOS: 5 (genomic) and 4 (cDNA). The amino acid
sequence may be found in SEQ ID NO: 6.

In ein3-1, a "G" to "A" conversion in the genomic
DNA at nucleotide 1598 occurs as a result of EMS
mutagenesis. In the corresponding protein, "W" is changed
to a stop codon at amino acid position 215. The ein3-2
mutation was generated by T-DNA insertion mutagenesis. The
T-DNA inserted after nucleotide 2001 of the genomic sequence,
interrupting the protein after amino acid 349. The ein3-3
mutation results in a "G" to "T" switch at nucleotide
position 1688 of genomic DNA as a result of DEB
mutagenesis. The amino acid sequence results in a
conversion of "K" to "N" at amino acid position 245.

The cDNAs of eil1, eil2, and eil3 are set forth
in SEQ ID NOS: 7, 9, and 11, respectively. The
corresponding amino acid sequences for the ein-like genes
are set forth in SEQ ID NOS: 8, 10, and 12 (eil1, eil2,
and eil3, respectively). A consensus sequence representing
the common codons of the three ein-like genes is SEQ ID NO:
13.

Table 6 sets forth the EIN3 alleles and the
results of the mutagenesis. The translation start site of
EIN3 is at nucleotide position 954 of the genomic sequence;
the translation start sites for EIL1, EIL2, and EIL3 are at
nucleotide positions 251, 8, and 102 of the respective cDNA
sequences.
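
Because the EIN3 point mutations are numbered on the genomic sequence
while their effects are stated by amino acid position, the two
numbering schemes can be related by simple arithmetic from the stated
translation start site at position 954. The Python sketch below assumes
that no intron lies between the translation start and the mutated
nucleotide; this is an assumption made for the illustration and not a
statement of the disclosure. The function name codon_index is
hypothetical.

```python
EIN3_TRANSLATION_START = 954  # genomic position of the first coding base


def codon_index(position: int, start: int = EIN3_TRANSLATION_START):
    """Return (amino acid number, base within codon), both 1-based.

    Assumes an uninterrupted coding sequence between start and position.
    """
    offset = position - start
    return offset // 3 + 1, offset % 3 + 1


stop_mutation = codon_index(1598)      # "G" to "A" creating a stop codon
missense_mutation = codon_index(1688)  # "G" to "T", "K" to "N"
```

Under this assumption, nucleotide 1598 falls on the third base of codon
215 and nucleotide 1688 on the third base of codon 245, in agreement
with the amino acid positions given for these two mutations.
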

The present invention is directed to wild type
and mutant sequences for the Hls1 locus. The hls gene is
regulated by ethylene directly. Amino acid and protein
sequences corresponding to the wild type and mutant gene
for Hls1 are within the scope of the present invention.

The present invention is directed to nucleic acid
sequences of the HLS1 locus. Wild type and mutant
sequences of HLS1 are within the scope of the present
invention. Amino acid and protein sequences corresponding
to the nucleic acid sequences are included in the present
invention. The HLS1 mutations are identified below by
nucleotide position number in accordance with the beginning
of the genomic DNA sequence.

SEQUENCE ID NO: 14, the isolated cDNA
representing the nucleic acid sequence coding for HLS1, and
the isolated genomic HLS1 sequence of SEQUENCE ID NO: 15
are embodiments of the present invention. The purified
amino acid sequence of SEQUENCE ID NO: 16 represents the
HLS1 protein product encoded by the cDNA identified above.

An hls1-1 mutation was created by EMS mutagenesis
which resulted in a "G" to "A" transition at nucleotide
position 3487 of the genomic DNA sequence. This mutation
results in the corresponding amino acid sequence having a
"Glu" to "Lys" substitution at amino acid position 345.

An hls1-5 mutation was generated by DEB
mutagenesis. The hls1-5 mutation has a "T" to "A"
mutation at position 2194 of the HLS1 genomic DNA sequence,
resulting in a mutation in the splice donor site. An hls1-7
mutation was also created by DEB and resulted in a "T" to
"A" change at nucleic acid position 2194. The result
is also a mutation in the splice
donor site. Mutations at splice donor sites often result
in aberrant splicing causing a frameshift or insertion to
occur. The exact nature of the change in hls1-5 and hls1-7
may be determined by analyzing the protein from those
mutants using an antibody.

hls1-6 is a mutation created by EMS resulting in
a "T" to "G" change at nucleic acid position 3431. The
corresponding amino acid sequence has a "Lys" to "Trp"
substitution at amino acid position 326.

The mutation hls1-4 was created by DEB
mutagenesis resulting in a "G" to "A" transition at nucleic
acid position 3487. The corresponding amino acid sequence
has a "Glu" to "Lys" change at amino acid position 345.

hls1-9 was created by EMS mutagenesis. The
sequence has a "C" to "T" change at nucleic acid position
2060, which converts an "Arg" codon to "TGA", creating a
"stop signal" at amino acid position 11.

hls1-8 is a mutation resulting from EMS
mutagenesis. The nucleic acid sequence has a "C" to "T"
change at position 2992. The mutation results in an amino
acid sequence having an "Arg" to "Stop" change at amino
acid position 180.

An EMS mutation resulting in a "G" to "A" change
at nucleic acid position 2033 is represented by hls1-20.
The amino acid sequence corresponding to the mutation
reveals a "Met" (Start signal) to "Ile" change at amino
acid position 1.

Table 7 sets forth the HLS1 alleles and the 
results of the mutagenesis. 
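
The effect of a single base substitution on the encoded residue can be
read from the standard genetic code. The Python sketch below builds the
standard codon table and applies three of the HLS1 substitutions
described above. The wild type codons are inferred rather than quoted
from the disclosure: CGA is the only Arg codon that a "C" to "T" change
converts to TGA, ATG is the start codon, and GAG is assumed for Glu
(GAA behaves the same way). The helper names substitute and
is_transition are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): residue
    for codon, residue in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
PURINES = {"A", "G"}


def substitute(codon: str, index: int, new_base: str):
    """Return (mutant codon, wild type residue, mutant residue)."""
    mutant = codon[:index] + new_base + codon[index + 1:]
    return mutant, CODON_TABLE[codon], CODON_TABLE[mutant]


def is_transition(old_base: str, new_base: str) -> bool:
    """True for purine-to-purine or pyrimidine-to-pyrimidine changes."""
    return old_base != new_base and (old_base in PURINES) == (new_base in PURINES)


arg_to_stop = substitute("CGA", 0, "T")  # "C" to "T": Arg codon becomes TGA
met_to_ile = substitute("ATG", 2, "A")   # "G" to "A": start Met becomes Ile
glu_to_lys = substitute("GAG", 0, "A")   # "G" to "A": Glu becomes Lys (codon assumed)
```

By the usual definition, "G" to "A" and "C" to "T" changes are
transitions, whereas a "T" to "A" change such as that at the splice
donor site is a transversion.
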

In accordance with the present invention, nucleic
acid sequences include and are not limited to DNA,
including and not limited to cDNA and genomic DNA; RNA,
including and not limited to mRNA and tRNA; and suitable
nucleic acid sequences such as those set forth in SEQUENCE
ID NUMBERS set forth herein, and alterations in the nucleic
acid sequences including alterations, deletions, mutations
and homologs. In addition, mismatches within the sequences
identified above, which achieve the methods of the
invention, are also considered within the scope of the
disclosure. The sequences may also be unmodified or
modified.

Also, amino acid, peptide and protein sequences
within the scope of the present invention include, and are
not limited to, the sequences set forth herein and
alterations in the amino acid sequences including
alterations, deletions, mutations and homologs.

In accordance with the invention, the nucleic
acid sequences employed in the invention may be
exogenous/heterologous sequences. Exogenous and
heterologous, as used herein, denote a nucleic acid
sequence which is not obtained from and would not normally
form a part of the genetic make-up of the plant or the cell
to be transformed, in its untransformed state. Plants
comprising exogenous nucleic acid sequences of ein2, ein3,
eil1, eil2, eil3, or hls1 mutations, such as and not
limited to the nucleic acid sequences of SEQUENCE ID
NUMBERS set forth herein, are within the scope of the
invention.

Transfected and/or transformed plant cells
comprising nucleic acid sequences of ein2, ein3, eil1,
eil2, eil3, or hls1 mutations, such as and not limited to
the nucleic acid sequences of SEQUENCE ID NUMBERS set forth
herein, are within the scope of the invention. Transfected
cells of the invention may be prepared by employing
standard transfection techniques and procedures as set
forth in Sambrook et al., Molecular Cloning: A Laboratory
Manual, 2nd ed., 1989, Cold Spring Harbor Laboratory Press,
Cold Spring Harbor, NY, hereby incorporated by reference in
its entirety.

In accordance with the present invention, mutant
plants which may be created with the sequences of the
claimed invention include higher and lower plants in the
Plant Kingdom. Mature plants and seedlings are included in
the scope of the invention. A mature plant includes a
plant at any stage in development beyond the seedling. A
seedling is a very young, immature plant in the early
stages of development.

Particularly preferred plants are those from:
the Family Umbelliferae, particularly of the genera Daucus
(particularly the species carota, carrot) and Apium
(particularly the species graveolens dulce, celery) and the
like; the Family Solanaceae, particularly of the genus
Lycopersicon, particularly the species esculentum (tomato)
and the genus Solanum, particularly the species tuberosum
(potato) and melongena (eggplant), and the like, and the
genus Capsicum, particularly the species annuum (pepper) and
the like; and the Family Leguminosae, particularly the
genus Glycine, particularly the species max (soybean) and
the like; and the Family Cruciferae, particularly of the
genus Brassica, particularly the species campestris
(turnip), oleracea cv Tastie (cabbage), oleracea cv
Snowball Y (cauliflower) and oleracea cv Emperor (broccoli)
and the like; the Family Compositae, particularly the genus
Lactuca, and the species sativa (lettuce), and the genus
Arabidopsis, particularly the species thaliana (Thale
cress) and the like. Of these Families, the most preferred
are the leafy vegetables, for example, the Family
Cruciferae, especially the genus Arabidopsis, most
especially the species thaliana.

Ein2 mutant sequences render plants disease and
pathogen tolerant, and ethylene insensitive. For purposes
of the current invention, disease tolerance is the ability
of a plant to survive infection with minimal injury or
reduction in the harvested yield of saleable material.
Plants with disease tolerance may have extensive levels of
infection but have little necrosis and few to no lesions.
These plants may also have reduced necrotic and water
soaking responses and chlorophyll loss may be virtually
absent. In contrast, resistant plants generally limit the
growth of pathogens and contain the infection to a
localized area with multiple apparent injurious lesions.

The current invention is directed to, for
example, identifying plant tolerance to bacterial
infections including, but not limited to, Clavibacter
michiganense (formerly Corynebacterium michiganense),
Pseudomonas solanacearum and Erwinia stewartii, and more
particularly, Xanthomonas campestris (specifically
pathovars campestris and vesicatoria), Pseudomonas syringae
(specifically pathovars tomato, maculicola).

In addition to bacterial infections, disease
tolerance to infection by other plant pathogens is within
the scope of the invention. Examples of viral and fungal
pathogens include, but are not limited to, tobacco mosaic
virus, cauliflower mosaic virus, turnip crinkle virus,
turnip yellow mosaic virus; fungi including Phytophthora
infestans, Peronospora parasitica, Rhizoctonia solani,
Botrytis cinerea, Phoma lingam (Leptosphaeria maculans),
and Albugo candida.

Like ein2, ein3 mutants also exhibit ethylene
insensitivity. However, ein3 mutants do not exhibit
disease or pathogen tolerance. Ethylene, CH2=CH2, is a
naturally occurring plant hormone. The ethylene regulatory
pathway includes the ethylene biosynthesis pathway and the
ethylene autoregulatory or feedback pathway, see Figure 6.
In the ethylene biosynthesis pathway, methionine is
converted to ethylene with S-adenosylmethionine (SAM) and
1-aminocyclopropane-1-carboxylic acid (ACC) as
intermediates. The conversion of SAM to ACC and the
conversion of ACC to ethylene are catalyzed by ACC
synthase and ethylene-forming enzyme (EFE), respectively.
Little is known about the enzymes catalyzing these
reactions and their regulation at the molecular level.

The receptor and receptor complex of Figure 6 are
believed to function with the autoregulatory pathway in the
control of ethylene production. Ethylene regulatory
pathway inhibitors are positioned along the left side of
Figure 6. The inhibitors include AVG (aminoethoxyvinylglycine)
and AIB (α-aminoisobutyric acid). The steps at
which the mutants, ethylene overproducer (eto1), ethylene
insensitive (ein1, ein2) and hookless (hls1), are defective
appear on the right of Figure 6.

In accordance with the claimed invention,
ethylene insensitive plants are those which are unable to
display a typical ethylene response when treated with high
concentrations of ethylene. For purposes of the present
invention, ethylene insensitivity includes total or partial
inability to display a typical ethylene response. A
typical ethylene response in wild type plants includes, for
example, the so-called "triple response" which involves
inhibition of root and stem elongation, radial swelling of
the stem, and absence of normal geotropic response
(diageotropism). Thus, for example, ethylene insensitive
plants may be created in accordance with the present
invention by the presence of an altered "triple response"
wherein the root and stem are elongated despite the
presence of high concentrations of ethylene. Further, a
typical ethylene response also includes a shut down or
diminution of endogenous ethylene production, upon
application of high concentrations of ethylene. Ethylene
insensitive plants may thus also be screened for, in
accordance with the present invention, by the ability to
continue production of ethylene, despite administration of
high concentrations of ethylene. Such ethylene insensitive
plants are believed to have impaired receptor function such
that ethylene is constitutively produced despite the
presence of an abundance of exogenous ethylene.

Screening includes screening for root or stem
elongation and screening for increased ethylene production.
Ethylene sensitive wild type plants experience an
inhibition of root and stem elongation when an inhibitory
amount of ethylene is administered. By inhibition of root
and stem elongation, it is meant that the roots and stems
grow less than in the normal state (that is, growth without
application of an inhibitory amount of ethylene).
Typically, for normal Arabidopsis (Col) grown without
ethylene or the ethylene precursor
1-aminocyclopropane-1-carboxylic acid (ACC), root
elongation is about 6.5 ± 0.2 mm/3 days; normal stem
elongation is 8.7 ± 0.3 mm/3 days. ein2-1 plants grown
without ethylene or ACC have root elongation of about 7.5 ±
0.2 mm/3 days and stem elongation of 11.35 ± 0.3 mm/3 days.
In the presence of 100 μM ACC, root growth is 1.5 ± 0.04
mm/3 days for Col and 4.11 ± 0.1 mm/3 days for ein2-1, and
stem growth is 3.2 ± 0.1 mm/3 days for Col and 8.0 ± 0.2
mm/3 days for ein2-1. Alternatively, plants may be sprayed
with ethephon or ethrel. By roots, as used here, it is
meant mature roots (that is, roots of any plant beyond the
rudimentary root of the seedling), as well as roots and
root radicles of seedlings. Stems include hypocotyls of
immature plants or seedlings, and stems and plant axes of
mature plants (that is, any stem beyond the hypocotyl of
seedlings). See Figure 7A and Figure 7B.
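The elongation figures quoted above can be turned into a percent inhibition for each line and organ. The following is only an illustrative sketch, not part of the described method: the function and variable names are hypothetical, and it uses the mean values from the text while ignoring the stated errors.

```python
# Mean elongation in mm per 3 days, as quoted in the text above:
# (growth without ACC, growth in the presence of ACC).
ELONGATION_MM = {
    ("Col", "root"): (6.5, 1.5),
    ("ein2-1", "root"): (7.5, 4.11),
    ("Col", "stem"): (8.7, 3.2),
    ("ein2-1", "stem"): (11.35, 8.0),
}


def percent_inhibition(untreated_mm, treated_mm):
    """Return the percent reduction in elongation relative to untreated growth."""
    if untreated_mm <= 0:
        raise ValueError("untreated elongation must be positive")
    return 100.0 * (untreated_mm - treated_mm) / untreated_mm


def inhibition_by_line(data=ELONGATION_MM):
    """Map each (line, organ) pair to its percent inhibition."""
    return {key: percent_inhibition(*pair) for key, pair in data.items()}


if __name__ == "__main__":
    for (line, organ), value in inhibition_by_line().items():
        print(f"{line:7s} {organ:5s} {value:5.1f}% inhibition")
```

On these mean values, Col roots are inhibited by roughly 77% and ein2-1 roots by roughly 45%, which is the kind of difference the elongation screen relies on.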

Ethylene sensitive wild type plants experience a
shut down or diminution of endogenous ethylene production,
upon application of high concentrations of ethylene. In
the ethylene insensitive plants of the present invention,
the plants continue endogenous production of ethylene,
despite administration of inhibitory amounts of ethylene.
Ethylene production for wild type and ethylene insensitive
mutants is shown in Table 1. An ethylene insensitive
plant will produce an amount or have a rate of ethylene
production greater than that of a wild type plant upon
administration of an inhibitory amount of ethylene. As one
skilled in the art will recognize, absolute levels of
ethylene produced will change with growth conditions.

ein1 and ein2 mutants are described, for example,
in Guzman et al., "Exploiting the Triple Response of
Arabidopsis to Identify Ethylene-Related Mutants", The
Plant Cell 1990, 2, 513, the disclosures of which are
hereby incorporated herein by reference, in their entirety.

The present invention is further described in the
following examples. These examples are not to be construed
as limiting the scope of the appended claims.

EXAMPLE 1

PRODUCTION OF Arabidopsis MUTANTS

The production of plants which exhibit enhanced
disease tolerance and ethylene insensitivity was
investigated with the use of Arabidopsis ein mutants, which
are insensitive to ethylene and are derived from
Arabidopsis Col-0. The ein mutants were prepared according
to the method of Guzman et al., The Plant Cell, 1990, 2,
513, the disclosures of which are hereby incorporated
herein by reference, in their entirety. Specifically,
twenty five independent ethylene-insensitive mutants were
isolated; six mutants, which showed at least a three-fold
difference in the length of the hypocotyl compared with
ethylene-treated wild-type hypocotyl, were further
characterized. In these mutants, the apical hook was
either present, absent or showed some curvature in the
apical region. The appearance of the apical curvature was
dependent on the duration of the incubation. After more
than 3 days of incubation in the dark with 10 μl/L
ethylene, the apical curvature was absent. This phenotype
was named "ein" for ethylene insensitive.

Mendelian analysis indicated that insensitivity
to ethylene was inherited as either a dominant or recessive
trait depending on the mutation studied. Complementation
analysis was performed with five recessive mutants to
determine whether more than one locus was involved in this
phenotype. The results of these studies indicated that all
five recessive mutations were allelic. The ein phenotype
was tested for linkage to nine visible markers to determine
whether the recessive and dominant ein mutations were
allelic. The dominant ein mutation was mapped close to the
ap-1 locus on chromosome 1 and was named ein1-1.
None of the nine markers showed linkage to the recessive
ein mutation. Restriction fragment length polymorphism
(RFLP) analysis was performed to map this mutation.
Randomly selected RFLP probes were initially used to assess
linkage. After testing probes from three different
chromosomes, linkage was detected to one RFLP from
chromosome 4, and the mutation was named ein2-1. This
observation was confirmed using additional RFLP probes from
the same chromosome. Further experimentation confirmed
ein2-2, ein2-3, ein2-4 and ein2-5 to be alleles of ein2-1.

Growth features of ethylene insensitive mutants
were also observed. After seedlings were planted in soil
and cold treated at 4°C for 4 days, the seedlings were
incubated in the dark at 23°C for 66-72 hours. Plants were
grown to maturity in a growth chamber at 22°C to 25°C under
continuous illumination with fluorescent and incandescent
light. The rosette of ein1-1 and ein2-1 plants was larger
compared with the wild type, Col-0, rosette and a delay in
bolting (1 cm to 2 cm growth in the length of the stem) was
observed. These observations indicated that the ethylene
insensitive mutations identified at the seedling stage
exerted remarkable effects during adult stages of growth.

eto mutants, which constitutively produce
ethylene, were initially screened by observing a
constitutive triple response: seedlings with inhibition of
hypocotyl and root elongation, swelling of the hypocotyl
and exaggerated tightening of the apical hook. Mendelian
segregation analysis determined the genetic basis of these
mutations to be a single recessive mutation, which was
identified as ethylene overproducer, or eto.

eto1, ein1 and ein2 mutants were analyzed to
determine ethylene accumulation. The mutants were
backcrossed to the wild type before physiological
examination. Surface-sterilized seeds (about 500) were
germinated and grown for 66 to 72 hours in the dark at 23°C
in 20 ml gas chromatograph vials containing 15 ml of growth
medium.

To measure the conversion of exogenous
1-aminocyclopropane-1-carboxylic acid (ACC, an intermediate
in ethylene production) to ethylene, seedlings were grown
in 1% low-melting-point agarose buffered with 3 mM Mes at
pH 5.8. In this solid support no chemical formation of
ethylene from ACC was detected at any of the concentrations
of ACC employed.

Ethylene accumulation from tissues of mature
plants (100 mg) was measured after overnight incubation in
20 ml gas chromatograph vials. Leaves and inflorescence
were taken from 24-28 day old plants, siliques from 32-36
day old plants. Accumulation of ethylene was determined by
gas chromatography using a photo-ionization detector (HNU)
and a Hewlett Packard HP5890A gas chromatograph equipped
with an automated headspace sampler. A certified standard
of 10 μl/L ethylene (Airco) was used to calculate ethylene
concentrations. The concentration of the inhibitors of
ethylene biosynthesis and ethylene action was determined
empirically. For eto mutants, AVG, α-aminoisobutyric acid,
and AgNO3 supplemented the media at 5 μM, 2 mM and 0.1 mM,
respectively, and trans-cyclooctene (17 μl/L) was injected
into the vial after the cold treatment. Ethylene
production was increased significantly in the dominant
ein1-1 mutant and the recessive ein2-1 mutant, see Table 1.
Ethylene production was inhibited in eto1-1 seedlings that
were grown in media supplemented with the ethylene
inhibitors aminoethoxyvinylglycine (AVG) and
α-aminoisobutyric acid (AIB), see Table 1.
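As a rough illustration of how the Table 1 accumulation values can be compared, the sketch below computes the fold change of eto1-1 over wild type for the tissues reported for both strains. The dictionary and function names are hypothetical, only the tabulated means are used, and the comparison is offered as a reading aid rather than as an analysis performed in the original work.

```python
# Mean ethylene accumulation from Table 1
# (nL for light-grown seedlings, nL/g for mature tissues).
WILD_TYPE = {
    "light-grown seedlings": 84.25,
    "leaves": 73.01,
    "siliques": 144.96,
    "inflorescence": 234.53,
}
ETO1_1 = {
    "light-grown seedlings": 182.01,
    "leaves": 174.39,
    "siliques": 322.16,
    "inflorescence": 1061.84,
}


def fold_change(mutant, reference):
    """Return mutant/reference ratios for tissues present in both tables."""
    return {
        tissue: mutant[tissue] / reference[tissue]
        for tissue in mutant
        if tissue in reference
    }


if __name__ == "__main__":
    for tissue, ratio in fold_change(ETO1_1, WILD_TYPE).items():
        print(f"{tissue:22s} eto1-1 / wild type = {ratio:.2f}")
```

On these means, eto1-1 accumulates a little over twice the wild type amount in light-grown seedlings, leaves and siliques, and about 4.5 times as much in inflorescences.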

The EIL sequences represent cDNA sequences
similar to the EIN3 sequence. They were obtained by
screening an Arabidopsis seedling cDNA library (Kieber et
al., Cell, 1993, 72, 427-441) at low stringency in the
following manner. The cDNA library was hybridized with the
radiolabeled EIN3 cDNA insert at 42°C for 48 hours in a
hybridization solution consisting of 30% formamide, 5X
Denhardt's solution, 0.5% SDS, 5X SSPE, 0.1 mg/ml sheared
salmon sperm DNA, according to the methods of Feinberg and
Vogelstein, Anal. Biochem. 1984, 177, 266-267,
incorporated herein by reference in its entirety. The
filters were washed at 42°C with 30% formamide, 0.5% SDS,
5X SSPE; followed by 2X SSPE.

Mutagenized HLS1 plants were obtained as set
forth above for EIN2, EIN3, and EIL.



Table 1

Ethylene Production in Triple Response Mutants

Strain / Tissue              Ethylene Accumulation

Wild Type
  Etiolated Seedlings           6.7 ± 0.68 nL
  Light-grown Seedlings        84.25 ± 13.95 nL
  Leaves                       73.01 ± 17.64 nL/g
  Siliques                    144.96 ± 28.99 nL/g
  Inflorescence               234.53 ± 18.04 nL/g

eto1-1
  Etiolated Seedlings         276.72 ± 53.70 nL
  Light-grown Seedlings       182.01 ± 24.84 nL
  Leaves                      174.39 ± 29.18 nL/g
  Siliques                    322.16 ± 38.66 nL/g
  Inflorescence              1061.84 ± 72.16 nL/g

hls1-1
  Etiolated Seedlings           5.81 ± 0.32 nL
  Leaves                       31.56 ± 0.32 nL

ein1-1
  Etiolated Seedlings          12.73 ± 2.79 nL
  Leaves                      222.95 ± 2.79 nL

ein2-1
  Etiolated Seedlings          20.69 ± 2.09 nL
  Leaves                      135.59 ± 26.89 nL/g

Another ethylene insensitive mutant of
Arabidopsis thaliana was designated etr by Bleecker et al.
in "Insensitivity to Ethylene Conferred by a Dominant
Mutation in Arabidopsis thaliana", Science 1990, 241, 1086,
the disclosures of which are hereby incorporated herein by
reference, in their entirety. Etr was identified by the
ethylene-mediated inhibition of hypocotyl elongation in
dark-grown seedlings. Populations of the M2 generation from
mutagenized seed of Arabidopsis thaliana were plated on a
minimal medium solidified with 1% agar and placed in a
chamber through which 5 μl/L ethylene in air was
circulated. Seedlings that had grown more than 1 cm after
4 days were selected as potential ethylene insensitive
mutants. A screen of 75,000 seedlings yielded three mutant
lines that showed heritable insensitivity to ethylene.
Hypocotyl elongation of the etr mutant line was unaffected
by ethylene at concentrations of up to 100 μl/L, while
elongation of the wild type was inhibited by 70% with
ethylene at 1 μl/L.

EXAMPLE 2

CLONING AND SEQUENCING OF EIN2 

The EIN2 locus was identified by a map-based
cloning strategy described as follows. The ein2-1 mutant
was crossed onto the DP28 marker line (dis1, clv2, er, tt5)
according to the methods of Koornneef and Stamm, Methods in
Arabidopsis Research, eds. C. Koncz, N-H Chua, and J.
Schell, 1992, World Scientific Publishing Co., Singapore,
incorporated herein by reference in its entirety. The F2
progeny were mapped with Restriction Fragment Length
Polymorphisms (RFLPs) according to the methods of Chang et
al., Proc. Natl. Acad. Sci. USA 1988, 85, 6856 and Nam et
al., Plant Cell 1990, 1, 699, the disclosures of which are
hereby incorporated by reference in their entirety.

The ein2-1 mutation was found to segregate with
RFLPs on the top of chromosome five (Table 2). Two
recombinant progeny found with λ217 (E15 and E54) were also
recombinant with the more proximal g3837 and λ291 clones,
indicating that ein2-1 is distal to λ217. Recombinant
plants were identified by examining F3 families from the
ein2-1 x DP28 cross for the genotype at the λ217 locus.
Protocols are the same as for mapping with RFLPs.
Recombinants were defined by having at least one
recombinant chromosome in an ein2-1 homozygote. The
Ubq6121 marker, however, identified a different F2 progeny
(E46) as being recombinant. This positions ein2 within the
interval of λ217 and Ubq6121. To further limit the position
of ein2 on the top of chromosome 5, recombinants were
sought with the PCR based marker ATHCTR1, Bell et al.,
Methods in Plant Molecular Biology: A Laboratory Manual,
1993, eds. Maliga, Klessig, and Cashmore, Cold Spring
Harbor Laboratory Press, the disclosure of which is hereby
incorporated by reference in its entirety.

A single recombinant progeny was identified in
102 F2 progeny scored. This F2 progeny was also
recombinant at the proximal λ217 and ASA1 markers,
demonstrating the position of ein2 as distal to ATHCTR1.
Additional genetic information was generated by examining
recombinant progeny from a cross between ein2-2 and hy5.
Two additional recombination events between ein2-1 and
ATHCTR1 were identified by this approach. There were no
recombinant plants identified at the g3715 locus, a cosmid
clone identified in Nam et al., supra.
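The recombinant count above gives a rough sense of how tightly ATHCTR1 is linked to ein2. The sketch below is a back-of-the-envelope estimate only. It assumes that each scored F2 plant contributes two informative chromosomes and that the single recombinant plant carried exactly one recombinant chromosome; neither assumption is stated in the text, and the function name is hypothetical.

```python
def recombination_fraction(recombinant_chromosomes, plants_scored):
    """Estimate the recombination fraction between a marker and the mutation.

    Assumes every scored plant contributes two informative chromosomes.
    """
    if plants_scored <= 0:
        raise ValueError("plants_scored must be positive")
    chromosomes = 2 * plants_scored
    if not 0 <= recombinant_chromosomes <= chromosomes:
        raise ValueError("recombinant count out of range")
    return recombinant_chromosomes / chromosomes


if __name__ == "__main__":
    # One recombinant among 102 F2 progeny scored with ATHCTR1.
    r = recombination_fraction(1, 102)
    # For small r, percent recombination approximates map distance in cM.
    print(f"recombination fraction = {r:.4f} (about {100 * r:.2f}%)")
```

Under these assumptions the estimate is about 0.5% recombination, consistent with the close linkage that motivated building a physical contig from this marker.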



Characterization of Plants Having ein2 Mutation

ALLELE           HYPOCOTYL   SE     ROOT   SE     TL     SE

Columbia            3.6      0.2    1.6    0.1     5.2   0.2
Landsberg           3.2      0.1    1.7    0.1     4.9   0.2
Wassilewskija       2.7      0.1    0.9    0.1     3.6   0.1
ein2-1 *            6.0      0.3    7.1    0.1    13.1   0.4
ein2-3 *            8.2      0.2    5.9    0.3    14.1   0.4
ein2-4 *            7.5      0.2    6.3    0.4    13.8   0.5
ein2-5 *            8.4      0.2    7.2    0.5    15.6   0.5
ein2-6              8.8      0.4    5.4    0.2    14.2   0.5
ein2-7              5.9      0.1    3.8    0.1     9.7   0.2
ein2-9              7.3      0.2    5.5    0.2    12.8   0.3
ein2-10             6.4      0.1    4.7    0.4    11.1   0.5
ein2-11             8.1      0.1    7.7    0.3    15.8   0.4
ein2-12             6.5      0.3    4.4    0.3    10.9   0.4
ein2-13             5.4      0.2    3.7    0.2     9.1   0.4
ein2-15             6.9      0.5    5.3    0.4    12.2   0.9
ein2-16             8.1      0.3    7.7    0.6    15.8   0.7
ein2-18 +           6.2      0.2    6.5    0.4    12.7   0.4
ein2-19 +           7.1      0.2    6.2    0.5    13.3   0.6
ein2-20 +           5.8      0.2    5.2    0.2    11.0   0.3



All units are in mm, TL = Total Length, SE = Standard Error
* Guzman and Ecker, Plant Cell 1990, 2, 513.
+ Gift of Caren Chang and Elliot Meyerowitz, Pasadena, CA.

The flanking genetic markers were used to build a
Yeast Artificial Chromosome (YAC) physical contig spanning
the ein2 locus (Figure 1). The YAC positions were
identified by colony hybridization pursuant to the
technique of Matallana, et al., Methods in Arabidopsis
Research, eds. C. Koncz, N-H Chua, and J. Schell, 1992,
World Scientific Publishing Co., Singapore, the disclosures
of which are hereby incorporated by reference in their
entirety.

YAC clones are replicated in the yeast cells as
authentic chromosomes and so they are present as only one
copy per cell. This is an important difference from
bacterial colony hybridization and makes colony filter
treatment a critical step for successful sequence
detection. After growing colonies overnight on the
filters, the cell walls were digested and the spheroplasts
were lysed in order to prepare yeast DNA for hybridization.

Yeast cell wall digestion is stimulated by
reducing agents, such as 2-mercaptoethanol or DTT, that
modify the wall structure and make it more sensitive to
enzymatic action. Colony filters were placed on filter
paper soaked in 0.8% DTT in SOE buffer (1 M sorbitol, 20 mM
EDTA, 10 mM Tris-acetate pH 8.0) for 2-3 min. before
transferring them to filter paper soaked in SOE containing
1% 2-mercaptoethanol and 1 mg/ml Zymolyase 10-T in
individual 150 X 15 mm petri dishes. Petri dishes were
parafilmed and stacked in a sealed plastic bag and
incubated at 37°C overnight.

After spheroplasting, lysis was carried out by
placing the filters on whole sheets of Whatman 3 MM paper
soaked in the appropriate solution. The 3 MM sheets were
placed on Saran wrap and soaked immediately before use.
The filters were treated as follows:

1. 10% SDS for 10 min.;

2. 0.5 M NaOH for 10 min. (1.5 M NaCl should be
included for Hybond N+); repeat;

3. Air dry for 5 min.;

4. 1 M Tris-HCl (pH 7.6), 1.5 M NaCl for at
least 5 min.;

5. 0.1 M Tris-HCl (pH 7.6), 0.15 M NaCl for at
least 5 min. Cell debris on the filters was eliminated by
gently wiping the filters with Kimwipes soaked in the same
solution;

6. 2X SSPE for at least 5 min. This step
precedes hybridization.

Following lysis, the filters are air dried for 30 min. and
baked for 2 hours at 80°C.

The left ends of the identified YAC clones were
isolated by plasmid rescue according to Bell et al., 1994.
Right ends were isolated by either vectorette PCR according
to the methods of Matallana, et al., 1992, supra, or
inverse PCR as described by Bell, et al., 1994, supra, the
disclosures of which are hereby incorporated by reference
in their entirety. The yUP library appeared to be missing
clones corresponding to ATHCTR1; three clones hybridizing
to this locus were found within the EG library (Grill and
Somerville, Mol. Gen. Genet. 1991, 226, 484, incorporated
herein by reference in its entirety). The pEG23G5L left
end plasmid rescue hybridizes to useful EcoR I and Xba I
polymorphisms and hybridizes to the same lambda clone as
ATHCTR1 (λctg24; Kieber et al., Cell 1993, 72, 427,
incorporated herein by reference in its entirety). The
left end rescue pyUP2G11L hybridizes to EG23G5, linking the
Ubq6121/g3715 and ATHCTR1 clones into a contiguous array.
pyUP2G11L also contains a Bgl II polymorphism that is
informative in the ein2-1 X DP28 cross. The three plants
that are recombinant at ATHCTR1 are also recombinant at
pyUP2G11L; this indicates the position of ein2 is distal to
this YAC end (Figure 1).

To facilitate the identification of the ein2
locus, 24 alleles were identified (Table 1; Guzman and
Ecker, Plant Cell 1990, 2, 513, incorporated herein by
reference in its entirety). Many of these alleles were
generated by X-ray or diepoxybutane mutagenesis; these
mutagens are known to create polymorphisms that are
detectable by hybridization to a genomic Southern blot
(Clark, et al., Genetics 1986, 112, 755; Reardon et al.,
Genetics 1987, 115, 323, incorporated herein by reference
in their entirety). EcoR I, Hind III, BamH I, Bgl II, and
Sal I genomic Southern blots were made to find such a
polymorphism in the mutant alleles of ein2. The following
probes that mapped between Ubq6121 and yUP2G11L were
hybridized to the genomic allele blots: Ubq6121, EG19A10L,
yUP2G11R, g3715, yUP19E11L, EG23G5R, and yUP2G11L. The
cosmid clone g3715 hybridized to a restriction fragment
length polymorphism in ein2-12 that corresponds to a lost
EcoR I site (Figure 2). Based on this missing EcoR I site,
this region was examined further.

The 1.2 kb EcoR I fragment that corresponds to
one of the missing bands in ein2-12 was subcloned from
g3715 into pKS (Stratagene, La Jolla, CA); this clone is
named pgEE1.2 (Figure 3). The pgEE1.2 insert was used to
isolate 22 cDNA clones made from ethylene treated three-day
old etiolated Arabidopsis thaliana seedlings (Kieber, et
al. 1993, supra). pgEE1.2 was also used to identify a
single genomic lambda clone, λgE2, from a λDASH II library
made from adult Columbia plants. The λgE2 clone spanned
the 5' end of the locus and terminated within the 3' end of
the cDNA. Initially the pcE2.5 clone was sequenced but
since this clone was not full length, the 5' ends of
pcE2.17, pcE2.20, and pcE2.22 (Kieber, et al. 1993)
were sequenced to determine the structure of the full
length frame and ending within 60 bp from a putative "TATA"
box (Figure 4).

Using 5 μg of poly(A+) RNA from 3-day old
dark-grown, ethylene-treated Arabidopsis seedlings
(hypocotyls and cotyledons) as template and oligo(dT) as
primer, first-strand cDNA synthesis was catalyzed by
Moloney murine leukemia virus reverse transcriptase
(Pharmacia) for construction of the Arabidopsis cDNA
expression library. Second-strand cDNA was made as
described by Gubler and Hoffman, Gene 1983, 25, 263, which
is hereby incorporated by reference in its entirety, except



WO 95/35318 



PCT/US95/07744 



- 29 - 

that E. coli DNA ligase was omitted. After the second-
strand reaction, the ends of the cDNA were made blunt with
Klenow fragment, and EcoR I-Not I adaptors (Pharmacia) were
ligated to each end. The cDNA was purified from unligated
adaptors by spun-column chromatography using Sephacryl S-
300 and size fractionated on a 1% low melting point
minigel. Size-selected cDNAs (0.5-1, 1-2, 2-3, and 3-6 kb)
were removed from the gel using agarase (New England
BioLabs), phenol-chloroform extracted, and precipitated
using 0.3 M NaOAc (pH 7)-ethanol. A portion of each cDNA
size fraction (0.1 μg) was coprecipitated with 1 μg of
λZAPII EcoR I-digested, dephosphorylated arms and then
ligated overnight in a volume of 4 μl. Each ligation mix
was packaged in vitro using Gigapack II Gold packaging
extract (Stratagene). The structure of this locus was
determined by Southern hybridization and restriction
mapping of the λgE2 and g3715.

The sequence of the EIN2 genomic DNA was
determined from PCR products and the λgE2 genomic lambda
clone. Primers were selected from the sequence of the
pcE2.5, pcE2.17, and genomic subclones of λgE2.
The primers were then commercially synthesized (Research
Genetics, Huntsville, AL).

Table 3
PRIMERS FOR THE EIN2 LOCUS

SEQUENCE   Primer
ID NO.     Name      Sequence                    Position

21         PE2.7A    GGATCCTCTAGTCAAATTACCGC
22         PE2.7B    AGATCTGGTATATTCCGTCTGCAC
23         PE2.5'    CCGGATTCGGTTTGTAGC          PCR/ [illegible]
24         PE1       GACGTGCATGTTCTTGGG
25         PE2       GAAAGCCACATCACCTGC
26         PE3       GGGGTGGAGTTATCCAC
27         PE4       GACACCGGGAAGTATCG
28         PE5       CTGCTTTCATAGAAGAGGC         PCR/
[illegible] PE6      [illegible]                 5' end
30         PE7       CACCCAGGTCTTGGTGG
31         PE8       GGCCGCCATGGATGCG
32         PE9       TCTCAATCAAGAGGAGGC
33         PE10A     CTTGAAGGATCCGAGTGG
34         PE11      CAGGTTGGCGAGTTCCTCG
35         PE12      CTTGCTGTTATTCTCCATGC
36         PE13      CCCTGGACCAGCTCCTGG
37         PE14      TGGCGCAAGCATCGTCCC          PCR/ middle
38         PE15      AAATGTTCAGGAATCTCTCG
39         PE16      CTGGCTGGCAGCCACGCC          PCR/ 3' end
40         PE17      GCGTTCTCAAAGCTGCGG
41         PE18      ACTGATGGGTCTTCTGGG
42         PE19      GGATCAGGATGGACCCGG
43         PE20      TGGTTGCTGAAGCCAGGG
44         PE21      TCCATTCATAGAGAGTGGG
45         PE22      ATGCCCAAGAACATGCACG
46         PE23      CAACTGATCCTTTACCCTGC
47         PE24      GTTGTTAGGTCAACTTGCG         PCR/ 5' end
48         PE25      CTCTGTTAGGGCTTCCTCC
49         PE26A     GAATCAGATTTCGCGAGG
50         PE27      GTCCAAATGGAGGAAGCC
51         PE28      CCACGACTGTACAATTGACCTTG     engineered Mun I site
52         PE29      CATGATCGCAAGTTGACC
53         PE30      AGAAAACTCTTATCAAGCTACG
54         PE31      AAGCTTATGGGTGCTCGTGC
55         PE32      GGAAAGAGAGAAAGACTCAG
56         PE33      GCCACCAAGTCATACCCG





Primer sequences are set forth 5' to 3'.

Four overlapping regions of the ein2 locus
between 1.2 and 3.2 kb in length were rapidly amplified by
polymerase chain reactions (Idaho Technologies, Idaho
Falls, Idaho). Conditions for the PCR reactions are as
follows: 92°C, 2 seconds; 56°C, 2 seconds; 72°C, 1 minute;
50 cycles. Between 200 and 500 ng of these PCR products
were directly sequenced on the ABI373A automated sequencer
using Taq Dye-Terminator chemistry (Applied Biosystems
Division, PEC). The genomic sequence of the wild type
Columbia EIN2 locus is shown in Figure 5. Eight mutant
alleles of ein2 were also sequenced and the corresponding
mutations identified (Table 4). The presence of these
mutations in the mutant alleles of ein2 confirms the
identity of this gene as EIN2.
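The 56°C annealing step quoted above can be compared with a simple estimate of primer melting temperature. The sketch below applies the Wallace rule, Tm = 2(A+T) + 4(G+C), to two primers from Table 3. This rule is a common rough approximation for short oligonucleotides and is an assumption introduced here for illustration; the text does not say how the primers or the annealing temperature were chosen, and the function names are hypothetical.

```python
def base_counts(primer):
    """Count A, C, G and T in a primer, rejecting any other character."""
    seq = primer.upper()
    counts = {base: seq.count(base) for base in "ACGT"}
    if sum(counts.values()) != len(seq):
        raise ValueError(f"unexpected characters in primer: {primer!r}")
    return counts


def gc_fraction(primer):
    """Fraction of G and C bases in the primer."""
    counts = base_counts(primer)
    return (counts["G"] + counts["C"]) / len(primer)


def wallace_tm(primer):
    """Approximate melting temperature in degrees C by the Wallace rule."""
    counts = base_counts(primer)
    return 2 * (counts["A"] + counts["T"]) + 4 * (counts["G"] + counts["C"])


if __name__ == "__main__":
    # Two primers copied from Table 3.
    primers = {"PE1": "GACGTGCATGTTCTTGGG", "PE8": "GGCCGCCATGGATGCG"}
    for name, seq in primers.items():
        print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.2f}, "
              f"Tm about {wallace_tm(seq)} C")
```

Both primers come out near 56°C by this estimate, which matches the stated annealing temperature, although more refined nearest-neighbor methods would give different absolute values.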



Table 4

IDENTIFIED MUTATIONS OF EIN2

ALLELE    MUTAGEN         MUTATION                    POSITION*   RESULT

ein2-3    X-ray           Insert T                    +3642       Frameshift
ein2-4    X-ray           AG to TT                    +2103       Frameshift
ein2-5    X-ray           ACATGACT                    +1570       Frameshift
ein2-6    Agrobacterium   AGAGTTGCGCATG               +965        aGVAH (115)
                          (SEQ ID NO: 17)                         (SEQ ID NO: 18)
ein2-5    DEB             A to C                      +4048       H to P
ein2-11   DEB             TG to AT                    +3492       Ochre
ein2-12   X-ray           ATGCTACAATCAGAATTCTTGCAGT   +1611       AATIRILAV
                          (SEQ ID NO: 19)                         (SEQ ID NO: 20)
ein2-16   X-ray           AGT to G                    +2851       Frameshift




* Position relative to the start of pcE2.17; see Figure 5,
nucleic acid; position 1 corresponds to the beginning of
the cDNA.

EXAMPLE 3 
CLONING AND SEQUENCING OF EIN3

In order to clone the EIN3 gene, a collection of
5000 T-DNA insertion lines (Feldmann and Marks, Mol. Gen.
Genet. 1987, 208, 1-9, incorporated herein by reference in
its entirety) was screened for ethylene-insensitive
mutants. A mutant with a phenotype similar to that of
ein3-1 (an EMS generated allele) was identified and genetic
complementation tests revealed that ein3-1 and the T-DNA
insertion mutant (designated ein3-2) were allelic.
Complete cosegregation of the mutant phenotype and the
dominant kanamycin resistance marker on the T-DNA indicated
that the T-DNA insertion was located within, or at least
very close to, the EIN3 gene. Genomic DNA flanking the
T-DNA insert was cloned using the left border rescue
technique. Genomic Southern blots of wild-type and ein3-2
DNA hybridized with the rescued fragment indicated that the
cloned segment of Arabidopsis DNA corresponded to sequences
disrupted by the T-DNA insert and did not result from
cloning an unlinked fragment of genomic DNA. In all
restriction digests the mobility of the hybridizing
fragments is shifted in the insertion mutant relative to
wild-type.

cDNA and genomic libraries constructed from
wild-type DNA were screened with the rescued DNA fragment.
The cDNAs obtained indicated that the EIN3 gene encodes a
628 amino acid open reading frame. Structural features of
the predicted polypeptide include: 1) a region rich in
acidic amino acids at the amino terminus, 2) several basic
domains in the central portion of the protein, and 3)
several poly-asparagine repeats near the carboxy terminus.
Although database searches revealed no overall similarities
to any characterized proteins, the three structural motifs
described are found in transcriptional regulatory proteins.
Stretches of acidic amino acids function in transcriptional
activation, presumably through binding to other proteins.
Basic domains serve as nuclear localization signals and can
bind DNA. Poly-asparagine repeats are present in the SWI1
protein of yeast. This protein has been termed a
transcriptional accessory protein because it is required
for transcriptional activation of target genes but does not
bind directly to DNA. It has been suggested that the
poly-asparagine repeats are involved in protein-protein
interactions.

Sequencing genomic clones indicated that the EIN3
gene has a very simple structure. There are no introns
within its open reading frame. However, there is a single
intron located in the 5' transcribed region. In addition
to sequencing the wild-type EIN3 gene, genes from three
independently isolated ein3 mutants were sequenced. In
each case an alteration was identified, confirming the
identification of the bona fide EIN3 gene. In the ein3-1
allele, a point mutation introduces a premature in-frame
stop codon. The ein3-2 allele contains a T-DNA insertion
which interrupts the coding region. A point mutation in the
ein3-3 allele substitutes an acidic amino acid for a basic
amino acid within one of the basic regions described above.
The expression pattern of the EIN3 gene in
seedlings was examined by placing the GUS reporter gene
under control of the EIN3 promoter. The construct employed
was a translational fusion including 5' non-transcribed
sequences, the 5' intron and 93 amino acids of the EIN3
coding region cloned upstream of the GUS gene in the pBI101
vector (Jefferson et al., EMBO J. 1987, 6, 3901-3907,
incorporated herein by reference in its entirety) and named
pHSEIN3GUS. Arabidopsis root explants were transformed and
transgenic plants regenerated (Valvekens et al., PNAS 1988,
85, 5536-5540, incorporated herein by reference in its
entirety). The GUS activity patterns observed suggest that
the EIN3 promoter is most active in expanding or elongating
cells. In three day old etiolated seedlings GUS activity
staining is located predominantly in the apical hook and
root tips. In younger seedlings, in which the hypocotyl is
not fully extended, staining is also prevalent throughout
this tissue. In 14 day old light grown seedlings abundant
GUS activity is observed in the roots, upper portions of
the hypocotyl, cotyledons and leaves. The EIN3 promoter is
not induced by ethylene, as the levels of GUS activity in
air and ethylene treated seedlings appear equivalent. This
observation is supported by the fact that steady state
levels of the endogenous EIN3 transcript are similar in
ethylene and air treated seedlings and adult plants as
determined by Northern analysis.

The EIN3 coding region was cloned downstream of
the bacterial reporter gene β-glucuronidase (GUS) in the
plasmid pRTL2-GUS according to the methods of Restrepo et
al., Plant Cell 1990, 2, 987-998, incorporated herein by
reference in its entirety, to create pNLEIN3Bgl2 (see
Figure ). The plasmid was transformed into Arabidopsis
protoplasts and transiently expressed according to the
methods of Abel and Theologis, Plant J. 1994, 5, 421-427,
incorporated herein by reference in its entirety. All
detectable GUS activity was targeted to the nuclei of the
protoplasts, indicating that the EIN3 protein functions in
the nucleus. These results suggest that the EIN3 protein
may function as a transcription factor which regulates
ethylene-regulated gene expression.

The EIN3 gene is a member of a small gene family.
Low stringency hybridization of genomic Southern blots
indicates that there are at least two members in addition
to EIN3. Three EIN3 homologues, designated EIL1, EIL2,
and EIL3, have been cloned and sequenced. The EIL and EIN3
predicted polypeptides are structurally similar in that the
amino termini of both proteins are rich in acidic amino
acids and their central regions contain several basic
domains. Their carboxyl termini are not as well conserved,
as EIL1 contains a polyglutamine repeat instead of
polyasparagine repeats. The EIL2 and EIL3 polypeptides do
not contain polyglutamine repeats or polyasparagine repeats.
It is interesting to note that the amino acid substitution
in the ein3-3 allele occurs in one of the regions rich in
basic amino acids that is completely conserved between the
EIN3 and EIL polypeptides. Currently, it is not known
whether the EIL gene products function in the ethylene
signal transduction pathway of Arabidopsis. However, at
this time, the EIL1 and EIL2 cDNAs do not map to the same
location as any of the characterized ethylene response
mutations. The location of the EIL3 cDNA has not yet been
mapped. The EIL1 polypeptide is the most similar to EIN3.

The ein3 mutant alleles were sequenced on an
Applied Biosystems 373A DNA Sequencing System (Foster City,
CA) using Taq dideoxy terminator chemistry (Applied
Biosystems). The PCR primers are set forth in Table 5.



TABLE 5
PRIMERS FOR EIN3 PCR

SEQUENCE   PRIMER   SEQUENCE                POSITION
ID NO.     NAME                             in genomic

57         PR24     CCTTCTATATTTGGTTCC      680-698
58         PR15     CCATTCTCCGGAATAATCC     1306-1324
59         PR5      CACGGAGCAGGATAAGGGTA    1148-1166
60         PR19     CGGATTGGATTGTGTGTGC     3312-3331

The primer sequences are set forth 5' to 3'.
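
As a rough check on the primers of Table 5, their length, GC fraction and an approximate melting temperature can be tabulated. The sketch below is illustrative only and is not part of the described method: it uses three of the four primers, and it assumes the simple rule-of-thumb estimate Tm = 2(A+T) + 4(G+C), which is a common approximation for short oligonucleotides and not a value reported in this application. The function names are hypothetical.

```python
# Primer sequences copied from Table 5 (5' to 3'); three of the four
# primers are used here for illustration.
PRIMERS = {
    "PR24": "CCTTCTATATTTGGTTCC",
    "PR15": "CCATTCTCCGGAATAATCC",
    "PR19": "CGGATTGGATTGTGTGTGC",
}


def gc_fraction(seq):
    """Fraction of G and C bases in the primer."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Rule-of-thumb Tm in degrees C: 2*(A+T) + 4*(G+C).

    Assumption: this simple estimate is used only for illustration;
    the application does not state how the primers were designed.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), round(gc_fraction(seq), 2), wallace_tm(seq))
```

Under this approximation the three primers fall in the range of about 50 to 58 degrees C.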



Primer pairs PR24 - PR15 and PR5 - PR19 were used
to amplify genomic DNA from the ein3 mutants. PCR
amplification was performed with a Biosycler Oven (New
Haven, CT). Conditions for amplification were as follows:
92°C for 1 min; 55°C for 1 min; 72°C for 3 min. The
mutations discovered are listed in Table 6.

Table 6

IDENTIFIED MUTATIONS OF EIN3

Allele    Mutagen   Sequence change          Consequences of sequence change

ein3-1    EMS       G to A, position 1598    amino acid 215, W to umber
ein3-2    T-DNA     position 2001            T-DNA insertion
ein3-3    DEB       G to T, position 1688    amino acid 245, K to N



The EIL genes were obtained by screening an
Arabidopsis seedling cDNA library (Kieber et al., Cell,
1993, 72, 427-441) at low stringency in the following
manner. The cDNA library was hybridized with the
radiolabeled EIN3 cDNA insert at 42°C for 48 hours in a
hybridization solution consisting of 30% formamide, 5X
Denhardt's solution, 0.5% SDS, 5X SSPE, 0.1 mg/ml sheared
salmon sperm DNA, according to the methods of Feinberg and
Vogelstein, Anal. Biochem. 1984, 177, 266-267,
incorporated herein by reference in its entirety. The
filters were washed at 42°C with 30% formamide, 0.55 SDS
(should this be 0.5% SDS?), 5X SSPE; followed by 2X SSPE.

EXAMPLE 4 

HOOKLESS MUTATION OF THE APICAL HOOK

The "triple response" in Arabidopsis thaliana
occurs in response to the plant hormone ethylene and is
characterized by three distinct changes in the morphology
of etiolated seedlings. These include exaggeration of the
apical hook, radial swelling of the hypocotyl, and
inhibition of root and hypocotyl elongation. Observation
of the apical hook was recorded by Charles Darwin as early
as 1896.

The hook causes the apical portion of the
seedling to become nearly parallel with the basal portion.
Production of the bend in the hypocotyl requires either a
larger number of cells, or increased elongation of cells on
the adaxial side (outside) of the hook. A study of the
characteristics of hook formation in bean seedlings
demonstrated that the curvature is produced by differential
growth rates on each half of the hypocotyl resulting in
longer cells on the convex side of the hook, see
Rubenstein, 1972 Plant Physiology 45:640-643.

Previous studies suggest that hormones may be
involved in hook formation. The hormones involved are
believed to be auxin and ethylene. Auxin is known to be a
controlling factor in cell elongation in the hypocotyl, see
Klee and Estelle, 1991 Annual Review of Plant Physiology
42:529-551, incorporated herein by reference in its
entirety, and ethylene has been shown to exaggerate the
bending of the hook in wild type etiolated seedlings
(Guzman and Ecker, supra). One hypothesis to explain hook
formation is that auxin promotes elongation of cells on the
outside of the apical hook, allowing differential growth
rates and bending. Work performed by McClure and Guilfoyle
(1989) demonstrated that the initial uniform expression of
small auxin up-RNA (SAUR) mRNA on both sides of the
hypocotyl was altered when the tissue was transferred from
an erect to a horizontal position. An increase in SAUR mRNA
accumulation was observed on the "outside" region and a
concurrent rapid decrease in SAUR mRNA occurred on the
"inside" region of an upward bending hypocotyl. Ethylene
has been shown to alter transport of auxin in hypocotyl
tissue (Mattoo and Suttle, supra), suggesting a possible
role for ethylene in exaggeration of the hook. To
exaggerate the hook, ethylene might affect auxin
localization, causing even more bending on the outside of
the hook.

The triple response of Arabidopsis has been used
to isolate mutants affected in the ethylene response. The
hookless1 (hls1) mutant exhibits a tissue-specific defect
in the triple response. Null mutants (hls1-1) completely
lack the apical hook in the presence and absence of
ethylene, while weak alleles of hls1 (hls1-2) show some
bending in the hook in the presence of ethylene. The
complementation cross between hls1-1 and hls1-2 gave rise
to F1 progeny which resembled hls1-2. In addition to hls1-1
and hls1-2, six EMS alleles, three DEB alleles, one X-ray
allele, and two non-tagged T-DNA alleles have been isolated
in accordance with the methods set forth in Guzman et al.,
The Plant Cell 1990 2:513-523, hereby incorporated by
reference in its entirety (Table 7). Seven of these are
strong alleles which are completely hookless in the
presence of ethylene. Five of these are weak alleles
showing a partial bend in the presence of ethylene. The
hls1 phenotype is epistatic in the hook with other ethylene
mutants.

Table 7

IDENTIFIED PHENOTYPIC AND PROTEIN MUTATIONS OF HLS1

ALLELE    MUTAGEN   HOOK ANGLE    CHANGE

hls1-1    EMS       2.2 ± 0.9     aa345 E to K
hls1-2    T-DNA     26.2 ± 3.2    T-DNA insertion
hls1-3    X-RAY     8.1 ± 1.8     4.8kb deletion of promoter
hls1-4    DEB       ND (strong)   aa345 E to K
hls1-5    DEB       1.3 ± 0.5     splice donor site mutated
hls1-6    EMS       2.1 ± 1.0     aa326 K to W
hls1-7    DEB       3.0 ± 1.3     splice donor site mutated
hls1-8    EMS       2.1 ± 1.2     aa180 R to stop
hls1-9    EMS       6.3 ± 1.5     aa11 R to stop
hls1-10   EMS       23.2 ± 3.0    aa1 M to I
hls1-11   T-DNA     3.0 ± 1.2     ND
hls1-12   EMS       ND (weak)     NC
hls1-13   EMS       ND (weak)     NC
hls1-14   T-DNA     ND (strong)   ND

ND = not determined;
NC = no change in coding region or introns

Gene Structure and Analysis

The HLS1 gene was cloned by left border rescue of
a T-DNA inserted in the promoter of hls1-2. The rescued
fragment was used to isolate a 12kb genomic clone which was
then used to isolate three cDNA clones. The T-DNA was
found to have inserted 710bp upstream from the 5' end of a
1.7kb cDNA clone. Deletions of the 1.7kb cDNA clone were
generated in both directions using Exonuclease III. These
clones were sequenced using Sequenase 2.0. Deletions of
the genomic clone were also generated using Exonuclease
III. These clones were also sequenced. The sequence of
the genomic clone covered the entire 1.7kb cDNA as well as
1712bp upstream of the start of the cDNA and 313bp at the
3' end of the cDNA. This gene has two introns of 342bp
and 81bp in size. The cDNA encoded a 403 amino acid
protein of about 43kDa.
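
The stated size of the HLS1 protein can be sanity-checked with simple arithmetic. The sketch below is only a back-of-the-envelope illustration: it assumes an average residue mass of about 110 Da, a common rule of thumb that is not taken from this application, and the function names are hypothetical.

```python
# Assumption: rule-of-thumb average mass of one amino acid residue (Da).
AVG_RESIDUE_MASS_DA = 110.0


def approx_mass_kda(n_residues, avg_residue_mass=AVG_RESIDUE_MASS_DA):
    """Approximate protein mass in kDa from its length in residues."""
    return n_residues * avg_residue_mass / 1000.0


def min_orf_length_bp(n_residues):
    """Shortest open reading frame (coding codons plus one stop codon)."""
    return 3 * (n_residues + 1)


if __name__ == "__main__":
    print(round(approx_mass_kda(403), 1))   # estimated mass in kDa
    print(min_orf_length_bp(403))           # coding length in bp
```

With these assumptions a 403 amino acid protein comes out near 44 kDa, in line with the "about 43kDa" given above, and needs a reading frame of 1212 bp, which fits within the 1.7kb cDNA.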



Sequence Analysis of the Alleles 

The hls1 gene from ten of the fourteen alleles
was sequenced. The transcribed region as well as both
introns were sequenced. The hls1 gene from each allele was
isolated by PCR amplification. The sequences of the
primers are set forth in Table 8.

Table 8
PRIMERS FOR HLS1 PCR

SEQUENCE   PRIMER   SEQUENCE               POSITION
ID NO.     NAME                            in genomic

61         II.1     cgccactgcatgtaagaac    1303-1321
62         II.2     tccacacgcttaatacggc    3229-3211
63         II.6     ggtacggagaagaaggag     2546-2563
64         III.1    cgcgggatattgattcggt    3071-3090
65         III.2    gtgttgaacacgcccacaa    ND
66         III.3    acgacaccacaaccacct     3479-3462
67         III.5    gacaagaagacacaaacc     3880-3863
68         pr1      gaatcggaggagaaggtc     3386-3403

Primer sequences are set forth 5' to 3'.
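
In Table 8 some positions are written in descending order (for example 3229-3211), which is read here as marking primers that anneal to the opposite strand. The sketch below illustrates how such coordinates translate into an expected product size. It is a hypothetical example: the application does not state which primers were paired, so treating II.1 and II.2 as a pair is an assumption, and the helper names are invented for illustration.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (case preserved)."""
    table = str.maketrans("acgtACGT", "tgcaTGCA")
    return seq.translate(table)[::-1]


def is_reverse_primer(position):
    """A (start, end) pair listed in descending order is read as reverse."""
    start, end = position
    return start > end


def amplicon_length(forward_pos, reverse_pos):
    """Span from the 5' end of the forward primer to that of the reverse."""
    return max(reverse_pos) - min(forward_pos) + 1


if __name__ == "__main__":
    ii_1 = (1303, 1321)   # Table 8, primer II.1
    ii_2 = (3229, 3211)   # Table 8, primer II.2 (descending order)
    # Assumption: II.1 and II.2 are used together; the text does not say so.
    print(is_reverse_primer(ii_2))
    print(amplicon_length(ii_1, ii_2))
    # Sequence of II.2 as it would read on the strand numbered in the table.
    print(reverse_complement("tccacacgcttaatacggc"))
```

Under that assumed pairing the product would span 1927 bp of the genomic sequence.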



PCR was performed on a Biosycler (New Haven, CT).
Conditions were 92°C, 1 min; 55°C, 1 min; 72°C, 3 min
for 35 cycles. Some of the PCR products were subcloned and
sequenced using Sequenase. Additional PCR products were
sequenced directly using sequence-specific primers and Taq
sequencing on an ABI automated sequencer (Foster City, CA).
Alleles found to contain a sequence change from wild type
were confirmed by direct sequencing of the PCR product
along with a wild type control. The changes found in these
alleles are listed below in Table 9.

Table 9

IDENTIFIED GENOTYPIC AND PROTEIN MUTATIONS OF HLS1

ALLELE    MUTAGEN   SEQUENCE CHANGE          CONSEQUENCES OF SEQUENCE CHANGE

hls1-1    EMS       G to A, position 3487    aa345 E to K
hls1-5    DEB       T to A, position 2194    splice donor site mutated
hls1-7    DEB       T to A, position 2194    splice donor site mutated
hls1-6    EMS       T to G, position 3431    aa326 K to W
hls1-4    DEB       G to A, position 3487    aa345 E to K
hls1-9    EMS       C to T, position 2060    aa11 R to stop (CGA - TGA)
hls1-8    EMS       C to T, position 2992    aa180 R to stop (CGA - TGA)
hls1-10   EMS       G to A, position 2033    aa1 M (start) to I

Two alleles which showed no changes in the
transcribed region or in the introns, hls1-12 and hls1-13,
were both weak alleles. hls1-12 was found to have reduced
levels of transcript compared with wild type. It is
possible that there are sequence changes in the promoter
region of hls1-12 and hls1-13.

Spatial and Temporal Detection and Expression

Northern analysis of the alleles revealed that the weak
alleles hls1-2, hls1-3 and hls1-12 all show a reduction in
the amount of transcript. The HLS1 transcript was found to
be up-regulated by ethylene.

HLS1 Homology 

Sequence comparison was done at the DNA as well
as the amino acid level using Blast and TFASTA (GCG). Some
homology to one class of acetyl transferases was found.
There are several classes of acetyl transferases with
little homology between classes. The homology in one class
of acetyl transferases is comprised of only a loose
consensus. HLS1 is similar to a class of acetyl
transferases found in bacteria and yeast and not similar to
the class found in mammalian systems. Tercero, J.C., JBC
1992, 267, 20270, published a minimum consensus for one
class of acetyl transferases. Other members of this class
include the yeast MAK3 gene, which acetylates a viral coat
protein and perhaps some mitochondrial proteins. The rimL
and rimJ proteins are also in this class of acetyl
transferases. These are E. coli proteins which acetylate
ribosomal proteins L12 and L5. Also included in this class
is the ARD1 protein of yeast. Mutants in this gene show a
specific mating defect, an inability to sporulate, and loss
of viability in stationary phase. There are several other
bacterial members of this class. The other 150 amino acids
of the HLS1 protein show no significant homology to any
proteins in the database.

Various modifications of the invention in
addition to those shown and described herein will be
apparent to those skilled in the art from the foregoing
description. Such modifications are also intended to fall
within the scope of the appended claims.

SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: Trustees of The University of Pennsylvania

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6042 base pairs

(B) TYPE: nucleic acid

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

TTCTCTCTCT CTCTTTGAAG GTGGCACGAG CACCCATAAC CTTCAGACCT ATAGATACAA 60 

ATATGTATGT ATACGTTTTT TATATATAAA TATTTTATAT AATTGATTTT TCGATCTTCT 120 

TTTATCTCTC TCTTTCGATG GAACTGAGCT CTTTCTCTCT TTCCTCTTCT TTTCTCTCTC 180 

TATCTCTATC TCTCGTAGCT TGATAAGAGT TTCTCTCTTT TGAAGATCCG TTTCTCTCTC 240

TCTCACTGAG ACTATTGTTG TTAGGTCAAC TTGCGATCAT GGCGATTTCG AAGGTGACTT 300

CTTTCAAAAA CCCTAATCCT CTGTTTTTTT TTTTATTTTG CTGGGGGGCT TTGTACGGAC 360

TTTCATGGGT TTTTGTAGCT TTTCCCTCGG CTTTTGCGCA AATGAGACTT TCTGGGTTTT 420

TTTTCCAGCT TTTTATAATT TCATCAGGTG GATCGAATTC GTAGTTTCAG CTTAGATCTC 480

TCTCCCTCTT CATTATCTGG ACTTTCCAGA CTTGGAGTTC TTCGGGATTG TTTTCGGTTT 540

CTGGGTTTTG TTTTAATTGC GAGATTTAAG CTTTTTTCTT TTTTACTACT GTACTTGGTT 600

TGTGGTTGAC CTTTTTTTTC CTTGAAGATC TGAATGCGTA GATCATACGG GATCTTTGCA 660

TTTTTGTTGC TTTTCGTCAG CGTTACGATT CTTTTAGCTT CAGTTTAGTT GAAATTTGTA 720

TTTTTTTTGA GCTTATCTTC TTTTTGTTGC TGCTTCATAC TAAGATCAAT TATTGATTTG 780

TAATACTACT GTATCTGAAG ATTTTCACCA TAAAAAAAAA ATTCAGGTCT GAAGCTGATT 840

TCGAATGGTT TGGAGATATC CGTAGTGGTT AAGCATATGG AAGTCTATGT TCTGCTCTTG 900

GTTGCTCTGT TAGGGCTTCC TCCATTTGGA CCAACTTAGC TGAATGTTGT ATGATCTCTC 960

TCCTTGAAGC AGCAAATAAG AAGAAGGTCT GGTCCTTAAC TTAACATCTG GTTACTAGAG 1020

GAAACTTCAG CTATTATTAG GTAAAGAAAG ACTGTACAGA GTTGTATAAC AAGTAAGCGT 1080

TAGAGTGGCT TTGTTTGCCT CGGTGATAGA AGAACCGACT GATTCGTTGT TGTGTGTTAG 1140

CTTTGGAGGG AATCAGATTT CGCGAGGGAA GGTGTTTTAG ATCAAATCTG TGAATTTTAC 1200

TCAACTGAGG CTTTTAGTGA ACCACGACTG TAGAGTTGAC CTTGAATCCT ACTCTGAGTA 1260

ATTATATTAT CAGATAGATT TAGGATGGAA GCTGAAATTG TGAATGTGAG ACCTCAGCTA 1320

GGGTTTATCC AGAGAATGGT TCCTGCTCTA CTTCCTGTCC TTTTGGTTTC TGTCGGATAT 1380

ATTGATCCCG GGAAATGGGT TGCAAATATC GAAGGAGGTG CTCGTTTCGG GTATGACTTG 1440

GTGGCAATTA CTCTGCTTTT CAATTTTGCC GCCATCTTAT GCCAATATGT TGCAGCTCGC 1500

ATAAGCGTTG TGACTGGTAA ACACTTGGCT CAGGTAAACA TTTTTCTGAT CTCTAAAGAG 1560

CAAACTTTTT AAAATAACAA ACTGGGCTCT GTGGTTGTCT TGTCACTTTC TCAAAGTGGA 1620

ATTCTACTAA CCACCTTCTC TATTTTTCTA ACATTTTAAT GTTCTTTACT GGGACAGATC 1680

TGCAATGAAG AATATGACAA GTGGACGTGC ATGTTCTTGG GCATTCAGGC GGAGTTCTCA 1740

GCAATTCTGC TCGACCTTAC CATGGTAGTT ACTTACAATT CTTTGCTGTT CTTAATTTTT 1800

TTATTATGTA GTAAAATTTT GATTCCTCTG ACTTGAGCTT CTCTATTATA AACAGGTTGT 1860

GGGAGTTGCG CATGCACTTA ACCTTTTGTT TGGGGTGGAG TTATCCACTG GAGTGTTTTT 1920

GGCCGCCATG GATGCGTTTT TATTTCCTGT TTTCGCCTCT TTCCTTGTAG TTACTTACAA 1980

TTCTTTGCTG TTCTTAATTT TTTTATTATG TAGTAAAATT TTGATTCCTC TGACTTGAGC 2040

TTCTCTATTA TAAACAGGAA AATGGTATGG CAAATACAGT ATCCATTTAC TCTGCAGGCC 2100

TGGTATTACT TCTCTATGTA TCTGGCGTCT TGCTGAGTCA GTCTGAGATC CCACTCTCTA 2160

TGAATGGAGT GTTAACTCGG TTAAATGGAG AGAGCGCATT CGCACTGATG GGTCTTCTTG 2220



SUBSTITUTE SHEET (RULE 26)



WO 95/35318 



PCT/US95/07744 



GCGCAAGCAT CGTCCCTCAC AATTTTTATA TCCATTCTTA TTTTGCTGGG GTACCTTTTT 2280

TCTCTTTATA TGTATCTCTC TTCTCTGTTA AGAAGCAATA ATTATACTAA GCAGTGAACG 2340

CTCTATTACA GGAAAGTACA TCTTCGTCTG ATGTCGACAA GAGCAGCTTG TGTCAAGACC 2400

ATTTGTTCGC CATCTTTGGT GTCTTCAGCG GACTGTCACT TGTAAATTAT GTATTGATGA 2460

ATGCAGCAGC TAATGTGTTT CACAGTACTG GCCTTGTGGT ACTGACTTTT CACGATGCCT 2520

TGTCACTAAT GGAGCAGGTT TGTTCTGACG GTTTTATGTT CGTATTAGTC AATAATTCAT 2580

TTTTAGGGAA AATGTTCAGA AATCTCTCGT GATTATTAAT TATCTTGTTC TTGATTGTTG 2640

ATCACAGGTA TTTATGAGTC CGCTCATTCC AGTGGTCTTT TTGATGCTCT TGTTCTTCTC 2700

TAGTCAAATT ACCGCACTAG CTTGGGCTTT CGGTGGAGAG GTCGTCCTGC ATGACTTCCT 2760

GAAGATAGAA ATACCCGCTT GGCTTCATCG TGCTACAATC AGAATTCTTG CAGTTGCTCC 2820

TGCGCTTTAT TGTGTATGGA CATCTGGTGC AGACGGAATA TACCAGTTAC TTATATTCAC 2880

CCAGGTCTTG GTGGCAATGA TGCTTCCTTG CTCGGTAATA CCGCTTTTCC GCATTGCTTC 2940

GTCGAGACAA ATCATGGGTG TCCATAAAAT CCCTCAGGTT GGCGAGTTCC TCGCACTTAC 3000

AACGTTTTTG GGATTTCTGG GGTTGAATGT TGTTTTTGTT GTTGAGATGG TATTTGGGAG 3060

CAGTGACTGG GCTGGTGGTT TGAGATGGAA TACCGGTATG GGCACCTCGA TTCAGTACAC 3120

CACTCTGCTT GTATCGTCAT GTGCATCCTT ATGCCTGATA CTCTGGCTGG CAGCCACGCC 3180

GCTGAAATCT GCGAGTAACA GAGCGGAAGC TCAAATATGG AACATGGATG CTCAAAATGC 3240

TTTATCTTAT CCATCTGTTC AAGAAGAGGA AATTGAAAGA ACAGAAACAA GGAGGAACGA 3300

AGACGAATCA ATAGTGCGGT TGGAAAGCAG GGTAAAGGAT CAGTTGGATA CTACGTCTGT 3360

TACTAGCTCG GTCTATGATT TGCCAGAGAA CATTCTAATG ACGGATCAAG AAATCCGTTC 3420

GAGCCCTCCA GAGGAAAGAG AGTTGGATGT AAAGTACTCT ACCTCTCAAG TTAGTAGTCT 3480

TAAGGAAGAC TCTGATGTAA AGGAACAGTC TGTATTGCAG TCAACAGTGG TTAATGAGGT 3540

CAGTGATAAG GATCTGATTG TTGAAACAAA GATGGCGAAA ATTGAACCAA TGAGTCCTGT 3600

GGAGAAGATT GTTAGCATGG AGAATAACAG CAAGTTTATT GAAAAGGATG TTGAAGGGGT 3660

TTCATGGGAA ACAGAAGAAG CTACCAAAGC TGCTCCTACA AGCAACTTTA CTGTCGGATC 3720

TGATGGTCCT CCTTCATTCC GCAGCTTAAG TGGGGAAGGG GGAAGTGGGA CTGGAAGCCT 3780

TTCACGGTTG CAAGGTTTGG GACGTGCTGC CCGGAGACAC TTATCTGCGA TCCTTGATGA 3840

ATTTTGGGGA CATTTATATG ATTTTCATGG GCAATTGGTT GCTGAAGCCA GGGCAAAGAA 3900

ACTAGATCAG CTGTTTGGCA CTGATCAAAA GTCAGCCTCT TCTATGAAAG CAGATTCGTT 3960

TGGAAAAGAC ATTAGCAGTG GATATTGCAT GTCACCAACT GCGAAGGGAA TGGATTCACA 4020

GATGACTTCA AGTTTATATG ATTCACTGAA GCAGCAGAGG ACACCGGGAA GTATCGATTC 4080

GTTGTATGGA TTACAAAGAG GTTCGTCACC GTCACCGTTG GTCAACCGTA TGCAGATGTT 4140

GGGTGCATAT GGTAACACCA CTAATAATAA TAATGCTTAC GAATTGAGTG AGAGAAGATA 4200

CTCTAGCCTG CGTGCTCCAT CATCTTCAGA GGGTTGGGAA CACCAACAAC CAGCTACAGT 4260

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 

CTTTTCTCTC TCTATCTCTA TCTCTCGTAG CTTGATAAGA GTTTCTCTCT TTTGAAGATC 60 

CGTTTCTCTC TCTCTCACTG AGACTATTGT TGTTAGGTCA ACTTGCGATC ATGGCGATTT 120 

CGAAGGTCTG AAGCTGATTT CGAATGGTTT GGAGATATCC GTAGTGGTTA AGCATATGGA 180 

AGTCTATGTT CTGCTCTTGG TTGCTCTGTT AGGGCTTCCT CCATTTGGAC CAACTTAGCT 240 

GAATGTTGTA TGATCTCTCT CCTTGAAGCA GCAAATAAGA AGAAGGTCTG GTCCTTAACT 300 

TAACATCTGG TTACTAGAGG AAACTTCAGC TATTATTAGG TAAAGAAAGA CTGTACAGAG 360 

TTGTATAACA AGTAAGCGTT AGAGTGGCTT TGTTTGCCTC GGTGATAGAA GAACCGACTG 420 

ATTCGTTGTT GTGTGTTAGC TTTGGAGGGA ATCAGATTTC GCGAGGGAAG GTGTTTTAGA 480 

TCAAATCTGT GAATTTTACT CAACTGAGGC TTTTAGTGAA CCACGACTGT AGAGTTGACC 540 

TTGAATCCTA CTCTGAGTAA TTATATTATC AGATAGATTT AGGATGGAAG CTGAAATTGT 600 

GAATGTGAGA CCTCAGCTAG GGTTTATCCA GAGAATGGTT CCTGCTCTAC TTCCTGTCCT 660 

TTTGGTTTCT GTCGGATATA TTGATCCCGG GAAATGGGTT GCAAATATCG AAGGAGGTGC 720 

TCGTTTCGGG TATGACTTGG TGGCAATTAC TCTGCTTTTC AATTTTGCCG CCATCTTATG 780 

CCAATATGTT GCAGCTCGCA TAAGCGTTGT GACTGGTAAA CACTTGGCTC AGATCTGCAA 840 

TGAAGAATAT GACAAGTGGA CGTGCATGTT CTTGGGCATT CAGGCGGAGT TCTCAGCAAT 900 

TCTGCTCGAC CTTACCATGG TTGTGGGAGT TGCGCATGCA CTTAACCTTT TGTTTGGGGT 960 

GGAGTTATCC ACTGGAGTGT TTTTGGCCGC CATGGATGCG TTTTTATTTC CTGTTTTCGC 1020 

CTCTTTCCTT GAAAATGGTA TGGCAAATAC AGTATCCATT TACTCTGCAG GCCTGGTATT 1080 

ACTTCTCTAT GTATCTGGCG TCTTGCTGAG TCAGTCTGAG ATCCCACTCT CTATGAATGG 1140

AGTGTTAACT CGGTTAAATG GAGAGAGCGC ATTCGCACTG ATGGGTCTTC TTGGCGCAAG 1200 

CATCGTCCCT CACAATTTTT ATATCCATTC TTATTTTGCT GGGGAAAGTA CATCTTCGTC 1260 

TGATGTCGAC AAGAGCAGCT TGTGTCAAGA CCATTTGTTC GCCATCTTTG GTGTCTTCAG 1320 

CGGACTGTCA CTTGTAAATT ATGTATTGAT GAATGCAGCA GCTAATGTGT TTCACAGTAC 1380 

TGGCCTTGTG GTACTGACTT TTCACGATGC CTTGTCACTA ATGGAGCAGG TATTTATGAG 1440 

TCCGCTCATT CCAGTGGTCT TTTTGATGCT CTTGTTCTTC TCTAGTCAAA TTACCGCACT 1500 

AGCTTGGGCT TTCGGTGGAG AGGTCGTCCT GCATGACTTC CTGAAGATAG AAATACCCGC 1560 

TTGGCTTCAT CGTGCTACAA TCAGAATTCT TGCAGTTGCT CCTGCGCTTT ATTGTGTATG 1620 

GACATCTGGT GCAGACGGAA TATACCAGTT ACTTATATTC ACCCAGGTCT TGGTGGCAAT 1680 

GATGCTTCCT TGCTCGGTAA TACCGCTTTT CCGCATTGCT TCGTCGAGAC AAATCATGGG 1740 
TGTCCATAAA ATCCCTCAGG TTGGCGAGTT CCTCGCACTT ACAACGTTTT TGGGATTTCT 1800

GGGGTTGAAT GTTGTTTTTG TTGTTGAGAT GGTATTTGGG AGCAGTGACT GGGCTGGTGG 1860 

TTTGAGATGG AATACCGGTA TGGGCACCTC GATTCAGTAC ACCACTCTGC TTGTATCGTC 1920 

ATGTGCATCC TTATGCCTGA TACTCTGGCT GGCAGCCACG CCGCTGAAAT CTGCGAGTAA 1980 

CAGAGCGGAA GCTCAAATAT GGAACATGGA TGCTCAAAAT GCTTTATCTT ATCCATCTGT 2040 

TCAAGAAGAG GAAATTGAAA GAACAGAAAC AAGGAGGAAC GAAGACGAAT CAATAGTGCG 2100 

GTTGGAAAGC AGGGTAAAGG ATCAGTTGGA TACTACGTCT GTTACTAGCT CGGTCTATGA 2160 

TTTGCCAGAG AACATTCTAA TGACGGATCA AGAAATCCGT TCGAGCCCTC CAGAGGAAAG 2220 

AGAGTTGGAT GTAAAGTACT CTACCTCTCA AGTTAGTAGT CTTAAGGAAG ACTCTGATGT 2280 

AAAGGAACAG TCTGTATTGC AGTCAACAGT GGTTAATGAG GTCAGTGATA AGGATCTGAT 2340 

TGTTGAAACA AAGATGGCGA AAATTGAACC AATGAGTCCT GTGGAGAAGA TTGTTAGCAT 2400 

GGAGAATAAC AGCAAGTTTA TTGAAAAGGA TGTTGAAGGG GTTTCATGGG AAACAGAAGA 2460 

AGCTACCAAA GCTGCTCCTA CAAGCAACTT TACTGTCGGA TCTGATGGTC CTCCTTCATT 2520 

CCGCAGCTTA AGTGGGGAAG GGGGAAGTGG GACTGGAAGC CTTTCACGGT TGCAAGGTTT 2580 

GGGACGTGCT GCCCGGAGAC ACTTATCTGC GATCCTTGAT GAATTTTGGG GACATTTATA 2640 

TGATTTTCAT GGGCAATTGG TTGCTGAAGC CAGGGCAAAG AAACTAGATC AGCTGTTTGG 2700 

CACTGATCAA AAGTCAGCCT CTTCTATGAA AGCAGATTCG TTTGGAAAAG ACATTAGCAG 2760 

TGGATATTGC ATGTCACCAA CTGCGAAGGG AATGGATTCA CAGATGACTT CAAGTTTATA 2820 

TGATTCACTG AAGCAGCAGA GGACACCGGG AAGTATCGAT TCGTTGTATG GATTACAAAG 2880 

AGGTTCGTCA CCGTCACCGT TGGTCAACCG TATGCAGATG TTGGGTGCAT ATGGTAACAC 2940 

CACTAATAAT AATAATGCTT ACGAATTGAG TGAGAGAAGA TACTCTAGCC TGCGTGCTCC 3000 

ATCATCTTCA GAGGGTTGGG AACACCAACA ACCAGCTACA GTTCACGGAT ACCAGATGAA 3060 

GTCATATGTA GACAATTTGG CAAAAGAAAG GCTTGAAGCC TTACAATCCC GTGGAGAGAT 3120 

CCCGACATCG AGATCTATGG CGCTTGGTAC ATTGAGCTAT ACACAGCAAC TTGCTTTAGC 3180 

CTTGAAACAG AAGTCCCAGA ATGGTCTAAC CCCTGGACCA GCTCCTGGGT TTGAGAATTT 3240 

TGCTGGGTCT AGAAGCATAT CGCGACAATC TGAAAGATCT TATTACGGTG TTCCATCTTC 3300 

TGGCAATACT GATACTGTTG GCGCAGCAGT AGCCAATGAG AAAAAATATA GTAGCATGCC 3360 

AGATATCTCA GGATTGTCTA TGTCCGCAAG GAACATGCAT TTACCAAACA ACAAGAGTGG 3420 

ATACTGGGAT CCGTCAAGTG GAGGAGGAGG GTATGGTGCG TCTTATGGTC GGTTAAGCAA 3480 

TGAATCATCG TTATATTCTA ATTTGGGGTC ACGGGTGGGA GTACCCTCGA CTTATGATGA 3540 

CATTTCTCAA TCAAGAGGAG GCTACAGAGA TGCCTACAGT TTGCCACAGA GTGCAACAAC 3600 

AGGGACCGGA TCGCTTTGGT CCAGACAGCC CTTTGAGCAG TTTGGTGTAG CGGAGAGGAA 3660 

TGGTGCTGTT GGTGAGGAGC TCAGGAATAG ATCGAATCCG ATCAATATAG ACAACAACGC 3720

TTCTTCTAAT GTTGATGCAG AGGCTAAGCT TCTTCAGTCG TTCAGGCACT GTATTCTAAA 3780 

GCTTATTAAA CTTGAAGGAT CCGAGTGGTT GTTTGGACAA AGCGATGGAG TTGATGAAGA 3840

ACTGATTGAC CGGGTAGCTG CACGAGAGAA GTTTATCTAT GAAGCTGAAG CTCGAGAAAT 3900

AAACCAGGTG GGTCACATGG GGGAGCCACT AATTTCATCG GTTCCTAACT GTGGAGATGG 3960

TTGCGTTTGG AGAGCTGATT TGATTGTGAG CTTTGGAGTT TGGTGCATTC ACCGTGTCCT 4020

TGACTTGTCT CTCATGGAGA GTCGGCCTGA GCTTTGGGGA AAGTACACTT ACGTTCTCAA 4080

CCGCCTACAG GGAGTGATTG ATCCGGCGTT CTCAAAGCTG CGGACACCAA TGACACCGTG 4140

CTTTTGCCTT CAGATTCCAG CGAGCCACCA GAGAGCGAGT CCGACTTCAG CTAACGGAAT 4200

GTTACCTCCG GCTGCAAAAC CGGCTAAAGG CAAATGCACA ACCGCAGTCA CACTTCTTGA 4260

TCTAATCAAA GACGTTGAAA TGGCAATCTC TTGTAGAAAA GGCCGAACCG GTACAGCTGC 4320

AGGTGATGTG GCTTTCCCAA AGGGGAAAGA GAATTTGGCT TCGGTTTCGA AGCGGTATAA 4380

ACGTCGGTTA TCGAATAAAC CAGTAAGGTA TGAATCAGGA TGGACCCGGT TCAAGAAAAA 4440

ACGTGACTGC GTACGGATCA TTGGGTTGAA GAAGAAGAAC ATTGTGAGAA ATCTCATGAT 4500

CAAAGTGACG TCGAGAGGGA AGCCGAAGAA TCAAAACTCT CGCTTTTGAT TGCTCCTCTG 4560

CTTCGTTAAT TGTGTATTAA GAAAAGAAGA AAAAAAATGG ATTTTTGTTG CTTCAGAATT 4620

TTTCGCTCTT TTTTTCTTAA TTTGGTTGTA ATGTTATGTT TATATACATA TATCATCATC 4680

ATAGGACCAT AGCTACAAAC CGAATCCGGT TTGTGTAATT CTATGCGGAA TCATAAAGAA 4740

ATCGTCG

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1321 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

Met Glu Ala Glu He Val Asn Val Arg Pro Gin Leu Gly Phe He Gin 
15 10 15 

Arg Met Val Pro Ala Leu Leu Pro Val Leu Leu Val Ser Val Gly Tyr 
20 25 30 

He Asp Pro Gly Lys Trp Val Ala Asn He Glu Gly Gly Ala Arg Phe 
35 40 45 

Glv Tvr Asp Leu Val Ala He Thr Leu Leu Phe Asn Phe Ala Ala He 
50 55 60 

Leu Cvs Gin Tyr Val Ala Ala Arg He Ser Val Val Thr Gly Lys His 
65 70 75 80 
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Leu Ala Gin lie Cys Asn Glu Glu Tyr Asp Lys Trp Thr Cys Met Phe 
85 90 95 

Leu Gly He Gin Ala Glu Phe Ser Ala He Leu Leu Asp Leu Thr Met 
100 105 HO 

Val Val Gly Val Ala His Ala Leu Asn Leu Leu Phe Gly Val Glu Leu 
115 120 125 

Ser Thr Gly Val Phe Leu Ala Ala Met Asp Ala Phe Leu Phe Pro Val 
130 135 140 

Phe Ala Ser Phe Leu Glu Asn Gly Met Ala Asn Thr Val Ser He Tyr 
145 150 155 160 

Ser Ala Gly Leu Val Leu Leu Leu Tyr Val Ser Gly Val Leu Leu Ser 
165 170 175 

Gin Ser Glu He Pro Leu Ser Met Asn Gly Val Leu Thr Arg Leu Asn 
180 185 190 

Gly Glu Ser Ala Phe Ala Leu Met Gly Leu Leu Gly Ala Ser He Val 
195 200 205 

Pro His Asn Phe Tyr He His Ser Tyr Phe Ala Gly Glu Ser Thr Ser 
210 215 220 

Ser Ser Asp Val Asp Lys Ser Ser Leu Cys Gin Asp His Leu Phe Ala 
225 230 235 240 

lie Phe Gly Val Phe Ser Gly Leu Ser Leu Val Asn Tyr Val Leu Met 
* 245 250 255 

Asn Ala Ala Ala Asn Val Phe His Ser Thr Gly Leu Val Val Leu Thr 
260 265 270 

Phe His Asp Ala Leu Ser Leu Met Glu Gin Val Phe Met Ser Pro Leu 
275 280 285 

He Pro Val Val Phe Leu Met Leu Leu Phe Phe Ser Ser Gin He Thr 
290 295 300 

Ala Leu Ala Trp Ala Phe Gly Gly Glu Val Val Leu His Asp Phe Leu 
305 310 315 320 

Lys He Glu He Pro Ala Trp Leu His Arg Ala Thr He Arg He Leu 
325 330 335 

Ala Val Ala Pro Ala Leu Tyr Cys Val Trp Thr Ser Gly Ala Asp Gly 
340 345 350 

He Tyr Gin Leu Leu He Phe Thr Gin Val Leu Val Ala Met Met Leu 
355 350 365 

Pro Cys Ser Val He Pro Leu 'Phe Arg He Ala Ser Ser Arg Gin He 
370 375 380 

Met Gly Val His Lys He Pro Gin Val Gly Glu Phe Leu Ala Leu Thr 
385 ' 390 395 400 

Thr Phe Leu Gly Phe Leu Gly Leu Asn Val Val Phe Val Val Glu Met 
405 410 415 

Val Phe Gly Ser Ser Asp Trp Ala Gly Gly Leu Arg Trp Asn Thr Gly 
420 425 430 

Met Gly Thr Ser He Gin Tyr Thr Thr Leu Leu Val Ser Ser Cys Ala 
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435 440 445 

Ser Leu Cys Leu lie Leu Trp Leu Ala Ala Thr Pro Leu Lys Ser Ala 
450 455 460 

Ser Asn Arg Ala Glu Ala Gin He Trp Asn Met Asp Ala Gin Asn Ala 
465 470 475 480 

Leu Ser Tyr Pro Ser Val Gin Glu Glu Glu He Glu Arg Thr Glu Thr 
485 490 495 

Arg Arg Asn Glu Asp Glu Ser He Val Arg Leu Glu Ser Arg Val Lys 
500 505 510 

Asp Gin Leu Asp Thr Thr Ser Val Thr Ser Ser Val Tyr Asp Leu Pro 
515 520 525 

Glu Asn He Leu Met Thr Asp Gin Glu He Arg Ser Ser Pro Pro Glu 
530 535 540 

Glu Arg Glu Leu Asp Val Lys Tyr Ser Thr Ser Gin Val Ser Ser Leu 
545 550 555 560 

Lys Glu Asp Ser Asp Val Lys Glu Gin Ser Val Leu Gin Ser Thr Val 
565 570 575 

Val Asn Glu Val Ser Asp Lys Asp Leu He Val Glu Thr Lys Met Ala 
580 585 590 

Lys He Glu Pro Met Ser Pro Val Glu Lys lie Val Ser Met Glu Asn 
595 600 605 

Asn Ser Lys Phe He Glu Lys Asp Val Glu Gly Val Ser Trp Glu Thr 
610 615 620 

Glu Glu Ala Thr Lys Ala Ala Pro Thr Ser Asn Phe Thr Val Gly Ser 
625 630 635 640 

Asp Gly Pro Pro Ser Phe Arg Ser Leu Ser Gly Glu Gly Gly Ser Gly 
645 650 655 

Thr Gly Ser Leu Ser Arg Leu Gin Gly Leu Gly Arg Ala Ala Arg Arg 
660 665 670 

His Leu Ser Ala He Leu Asp Glu Phe Trp Gly His Leu Tyr Asp Phe 
675 680 685 

His Gly Gin Leu Val Ala Glu Ala Arg Ala Lys Lys Leu Asp Gin Leu 
690 695 700 

Phe Gly Thr Asp Gin Lys Ser Ala Ser Ser Met Lys Ala Asp Ser Phe 
705 710 715 720 

Glv Lys Asp He Ser Ser Gly Tyr Cys Met Ser Pro Thr Ala Lys Gly 
725 730 735 

Met Asp Ser Gin Met Thr Ser Ser Leu Tyr Asp Ser Leu Lys Gin Gin 
740 745 750 

Arg Thr Pro Gly Ser lie Asp Ser Leu Tyr Gly Leu Gin Arg Gly Ser 
755 760 765 

Ser Pro Ser Pro Leu Val Asn Arg Met Gin Met Leu Gly Ala Tyr Gly 
770 775 780 

Asn Thr Thr Asn Asn Asn Asn Ala Tyr Glu Leu Ser Glu Arg Arg Tyr 
785 790 795 800 
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Ser Ser Leu Arg Ala Pro Ser Ser Ser Glu Gly Trp Glu His Gin Gin 
805 810 815 

Pro Ala Thr Val His Gly Tyr Gin Met Lys Ser Tyr Val Asp Asn Leu 
820 825 830 

Ala Lys Glu Arg Leu Glu Ala Leu Gin Ser Arg Gly Glu He Pro Thr 
835 840 845 

Ser Arg Ser Met Ala Leu Gly Thr Leu Ser Tyr Thr Gin Gin Leu Ala 
850 855 860 

Leu Ala Leu Lys Gin Lys Ser Gin Asn Gly Leu Thr Pro, Gly Pro Ala 
865 870 875 880 

Pro Gly Phe Glu Asn Phe Ala Gly Ser Arg Ser He Ser Arg Gin Ser 
885 890 895 

Glu Arg Ser Tyr Tyr Gly Val Pro Ser Ser Gly Asn Thr Asp Thr Val 
900 905 910 

Gly Ala Ala Val Ala Asn Glu Lys Lys Tyr Ser Ser Met Pro Asp He 
915 920 925 

Ser Gly Leu Ser Met Ser Ala Arg Asn Met His Leu Pro Asn Asn Lys 
930 935 940 

Ser Gly Tyr Trp Asp Pro Ser Ser Gly Gly Gly Gly Tyr Gly Ala Ser 
945 950 955 960 

Tyr Gly Arg Leu Ser Asn Glu Ser Ser Leu Tyr Ser Asn Leu Gly Ser 
965 970 975 

Arg Val Gly Val Pro Ser Thr Tyr Asp Asp Ile Ser Gln Ser Arg Gly
980 985 990

Gly Tyr Arg Asp Ala Tyr Ser Leu Pro Gln Ser Ala Thr Thr Gly Thr
995 1000 1005

Gly Ser Leu Trp Ser Arg Gln Pro Phe Glu Gln Phe Gly Val Ala Glu
1010 1015 1020

Arg Asn Gly Ala Val Gly Glu Glu Leu Arg Asn Arg Ser Asn Pro Ile
1025 1030 1035 1040

Asn Ile Asp Asn Asn Ala Ser Ser Asn Val Asp Ala Glu Ala Lys Leu
1045 1050 1055

Leu Gln Ser Phe Arg His Cys Ile Leu Lys Leu Ile Lys Leu Glu Gly
1060 1065 1070

Ser Glu Trp Leu Phe Gly Gln Ser Asp Gly Val Asp Glu Glu Leu Ile
1075 1080 1085

Asp Arg Val Ala Ala Arg Glu Lys Phe Ile Tyr Glu Ala Glu Ala Arg
1090 1095 1100

Glu Ile Asn Gln Val Gly His Met Gly Glu Pro Leu Ile Ser Ser Val
1105 1110 1115 1120

Pro Asn Cys Gly Asp Gly Cys Val Trp Arg Ala Asp Leu Ile Val Ser
1125 1130 1135

Phe Gly Val Trp Cys Ile His Arg Val Leu Asp Leu Ser Leu Met Glu
1140 1145 1150

Ser Arg Pro Glu Leu Trp Gly Lys Tyr Thr Tyr Val Leu Asn Arg Leu
1155 1160 1165

Gln Gly Val Ile Asp Pro Ala Phe Ser Lys Leu Arg Thr Pro Met Thr
1170 1175 1180

Pro Cys Phe Cys Leu Gln Ile Pro Ala Ser His Gln Arg Ala Ser Pro
1185 1190 1195 1200

Thr Ser Ala Asn Gly Met Leu Pro Pro Ala Ala Lys Pro Ala Lys Gly 
1205 1210 1215 

Lys Cys Thr Thr Ala Val Thr Leu Leu Asp Leu Ile Lys Asp Val Glu
1220 1225 1230

Met Ala Ile Ser Cys Arg Lys Gly Arg Thr Gly Thr Ala Ala Gly Asp
1235 1240 1245 

Val Ala Phe Pro Lys Gly Lys Glu Asn Leu Ala Ser Val Ser Lys Arg 
1250 1255 1260 

Tyr Lys Arg Arg Leu Ser Asn Lys Pro Val Arg Tyr Glu Ser Gly Trp
1265 1270 1275 1280

Thr Arg Phe Lys Lys Lys Arg Asp Cys Val Arg Ile Ile Gly Leu Lys
1285 1290 1295

Lys Lys Asn Ile Val Arg Asn Leu Met Ile Lys Val Thr Ser Arg Gly
1300 1305 1310

Lys Pro Lys Asn Gln Asn Ser Arg Phe
1315 1320 

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2310 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

TCTTCTTCTT CTTCCTCTTC CTCATCTCGT ATCTCTAACT TTTGTCGAAG TTCTTTTGAT 60 

GAAACTAGGG TTTATTATCT TCTCCTTCTT TTTCCCATCA CCATAGAAAA GGCAGAGACC 120 

TTTTTCTTCA TCATTTTTAT TCTCCTTCTT CTTCTGCTGT TCATTTCTCC AGGTTACAAT 180 

GATGTTTAAT GAGATGGGAA TGTGTGGAAA CATGGATTTC TTCTCTTCTG GATCACTTGG 240 

TGAAGTTGAT TTCTGTCCTG TTCCACAAGC TGAGCCTGAT TCCATTGTTG AAGATGACTA 300 

TACTGATGAT GAGATTGATG TTGATGAATT GGAGAGGAGG ATGTGGAGAG ACAAAATGCG 360 

GCTTAAACGT CTCAAGGAGC AGGATAAGGG TAAAGAAGGT GTTGATGCTG CTAAACAGAG 420 

GCAGTCTCAA GAGCAAGCTA GGAGGAAGAA AATGTCTAGA GCTCAAGATG GGATCTTGAA 480

GTATATGTTG AAGATGATGG AAGTTTGTAA AGCTCAAGGC TTTGTTTATG GGATTATTCC 540

GGAGAATGGG AAGCCTGTGA CTGGTGCTTC TGATAATTTA AGGGAGTGGT GGAAAGATAA 600

GGTTAGGTTT GATCGTAATG GTCCTGCGGC TATTACCAAG TATCAAGCGG AGAATAATAT 660

CCCGGGGATT CATGAAGGTA ATAACCCGAT TGGACCGACT CCTCATACCT TGCAAGAGCT 720

TCAAGACACG ACTCTTGGAT CGCTTTTGTC TGCGTTGATG CAACACTGTG ATCCTCCTCA 780

GAGACGTTTT CCTTTGGAGA AAGGAGTTCC TCCTCCGCGG TGGCCTAATG GGAAAGAGGA 840

TTGGTGGCCT CAACTTGGTT TGCCTAAAGA TCAAGGTCCT GCACCTTACA AGAAGCCTCA 900

TGATTTGAAG AAGGCGTGGA AAGTCGGCGT TTTGACTGCG GTTATCAAGC ATATGTTTCC 960

TGATATTGCT AAGATCCGTA AGCTCGTGAG GCAATCTAAA TGTTTGCAGG ATAAGATGAC 1020

TGCTAAAGAG AGTGCTACCT GGCTTGCTAT TATTAACCAA GAAGAGTCCT TGGCTAGAGA 1080

GCTTTATCCC GAGTCATGTC CACCTCTTTC TCTGTCTGGT GGAAGTTGCT CGCTTCTGAT 1140

GAATGATTGC AGTCAATACG ATGTTGAAGG TTTCGAGAAG GAGTCTCACT ATGAAGTGGA 1200

AGAGCTCAAG CCAGAAAAAG TTATGAATTC TTCAAACTTT GGGATGGTTG CTAAAATGCA 1260

TGACTTTCCT GTCAAAGAAG AAGTCCCAGC AGGAAACTCG GAATTCATGA GAAAGAGAAA 1320

GCCAAACAGA GATCTGAACA CTATTATGGA CAGAACCGTT TTCACCTGCG AGAATCTTGG 1380

GTGTGCGCAC AGCGAAATCA GCCGGGGATT TCTGGATAGG AATTCGAGAG ACAACCATCA 1440

ACTGGCATGT CCACATCGAG ACAGTCGCTT ACCGTATGGA GCAGCACCAT CCAGGTTTCA 1500

TGTCAATGAA GTTAAGCCTG TAGTTGGATT TCCTCAGCCA AGGCCAGTGA ACTCAGTAGC 1560

CCAACCAATT GACTTAACGG GTATAGTTCC TGAAGATGGA CAGAAGATGA TCTCAGAGCT 1620

CATGTCCATG TACGACAGAA ATGTCCAGAG CAACCAAACC TCTATGGTCA TGGAAAATCA 1680

AAGCGTGTCA CTGCTTCAAC CCACAGTCCA TAACCATCAA GAACATCTCC AGTTCCCAGG 1740

AAACATGGTG GAAGGAAGTT TCTTTGAAGA CTTGAACATC CCAAACAGAG CAAACAACAA 1800

CAACAGCAGC AACAATCAAA CGTTTTTTCA AGGGAACAAC AACAACAACA ATGTGTTTAA 1860

GTTCGACACT GCAGATCACA ACAACTTTGA AGCTGCACAT AACAACAACA ATAACAGTAG 1920

CGGCAACAGG TTCCAGCTTG TGTTTGATTC CACACCGTTC GACATGGCGT CATTCGATTA 1980

CAGAGATGAT ATGTCGATGC CAGGAGTAGT AGGAACGATG GATGGAATGC AGCAGAAGCA 2040

GCAAGATGTA TCCATATGGT TCTAAAGTCT TGGTAGTAGA TTTCATCTTC TCTTATTTTT 2100

ATCTTTTGTG TTCTTACATT CACTCAACCA TGTAATATTT TTTCCTGGGT CTCTCTGTCT 2160

CTATCGCTTG TTATGATGTG TCTGTAAGAG TCTCTAAAAA CTCTCTGTTA CTGTGTGTCT 2220

TTGTCTCGGC TTGGTGAATC TCTCTGTCAT CATCAGCTTT TAGTTACACA CCCGACTTGG 2280

GGATGAACGA ACACTAAATG TAAGTTTTCA 2310


(2) INFORMATION FOR SEQ ID NO: 5: 











(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3387 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

[illegible] [illegible] [illegible] [illegible] TATTAATTGT GTAATAATAA 60

TAATAAATGA TGTCTTAAAT [illegible] [illegible] TTAAAATGAT ATATATGTAT 120

ATTATATATC TANACATATA [illegible] [illegible] ATATATACTA TGATCTATCT 180

TCCTGATCTA CAGAGAGA[illegible]T [illegible] [illegible] ACAAAAGTCG CTTTCTAGCC 240

ACGTGATCTT TCGTCGACTT TTCTTCTTCT TCTTCTTCTT CCTCTTCCTC ATCTCGTATC 300

TCTAACTTTT GTCGAAGTTC TTTTGATGAA ACTAGGGTTT ATTATCTTCT CCTTCTTTTT 360

CCCATCACCA TAGAAAAGGC AGAGACCTTT TTCTTCATCA TTTTTATTCT CCTTCTTCTT 420

CTGCTGTTCA TTTCTCCAGG TACTATACGC TTCTTCTTCT ATTGATTTTT TAGGGTTATT 480

ATTGATACTG AAGATGATGA TAGGTTTATT CATAGGGTTT TACTAGATCG ATGGTTTTAC 540

TTTAGTTTAC TAGTGTTTAC ACGATCTAAT TTCATGAGTT TATNCTACTT TTAGTTTTTT 600

NTTTGGGTGA AGTTTTGTTT ATTGTTTATA AATCGTTGAT CTATTTGAAA ATGTTTTCTC 660

TTTCTTATTC ATATATGATC CTTTCTATAT TTGGTTCCTA TGTTGAAGAT CTCATCCTTT 720

TTTTGGAAAT TGAATCTGTT GATAATTTTT [illegible] TGATTATTTA GTTTAGGAGT 780

[illegible] [illegible] [illegible] [illegible] TTTGATTGAA TTCGAAAAGC 840

CCCTTTTTTA TAATTTAGGG TTTGATGATT TTTTTTAGTA AGTTGTTTGA TTCAGAAGAA 900


ATATAATTGT ACTGATTAGT TTTGTTTGTG TATTTGATTT GTTACAGGTT ACAATGATGT 960

TTAATGAGAT GGGAATGTGT GGAAACATGG ATTTCTTCTC TTCTGGATCA CTTGGTGAAG 1020

TTGATTTCTG TCCTGTTCCA CAAGCTGAGC CTGATTCCAT TGTTGAAGAT GACTATACTG 1080

ATGATGAGAT TGATGTTGAT GAATTGGAGA GGAGGATGTG GAGAGACAAA ATGCGGCTTA 1140

AACGTCTCAA GGAGCAGGAT AAGGGTAAAG AAGGTGTTGA TGCTGCTAAA CAGAGGCAGT 1200

CTCAAGAGCA AGCTAGGAGG AAGAAAATGT CTAGAGCTCA AGATGGGATC TTGAAGTATA 1260

TGTTGAAGAT GATGGAAGTT TGTAAAGCTC AAGGCTTTGT TTATGGGATT ATTCCGGAGA 1320

ATGGGAAGCC TGTGACTGGT GCTTCTGATA ATTTAAGGGA GTGGTGGAAA GATAAGGTTA 1380

GGTTTGATCG TAATGGTCCT GCGGCTATTA CCAAGTATCA AGCGGAGAAT AATATCCCGG 1440

GGATTCATGA AGGTAATAAC CCGATTGGAC CGACTCCTCA TACCTTGCAA GAGCTTCAAG 1500

ACACGACTCT TGGATCGCTT TTGTCTGCGT TGATGCAACA CTGTGATCCT CCTCAGAGAC 1560


GTTTTCCTTT GGAGAAAGGA GTTCCTCCTC CGTGGTGGCC TAATGGGAAA GAGGATTGGT 1620

GGCCTCAACT TGGTTTGCCT AAAGATCAAG GTCCTGCACC TTACAAGAAG CCTCATGATT 1680

TGAAGAAGGC GTGGAAAGTC GGCGTTTTGA CTGCGGTTAT CAAGCATATG TTTCCTGATA 1740

TTGCTAAGAT CCGTAAGCTC GTGAGGCAAT CTAAATGTTT GCAGGATAAG ATGACTGCTA 1800

AAGAGAGTGC TACCTGGCTT GCTATTATTA ACCAAGAAGA GTCCTTGGCT AGAGAGCTTT 1860

ATCCCGAGTC ATGTCCACCT CTTTCTCTGT CTGGTGGAAG TTGCTCGCTT CTGATGAATG 1920

ATTGCAGTCA ATACGATGTT GAAGGTTTCG AGAAGGAGTC TCACTATGAA GTGGAAGAGC 1980

TCAAGCCAGA AAAAGTTATG AATTCTTCAA ACTTTGGGAT GGTTGCTAAA ATGCATGACT 2040

TTCCTGTCAA AGAAGAAGTC CCAGCAGGAA ACTCGGAATT CATGAGAAAG AGAAAGCCAA 2100

ACAGAGATCT GAACACTATT ATGGACAGAA CCGTTTTCAC CTGCGAGAAT CTTGGGTGTG 2160

CGCACAGCGA AATCAGCCGG GGATTTCTGG ATAGGAATTC GAGAGACAAC CATCAACTGG 2220

CATGTCCACA TCGAGACAGT CGCTTACCGT ATGGAGCAGC ACCATCCAGG TTTCATGTCA 2280

ATGAAGTTAA GCCTGTAGTT GGATTTCCTC AGCCAAGGCC AGTGAACTCA GTAGCCCAAC 2340

CAATTGACTT AACGGGTATA GTTCCTGAAG ATGGACAGAA GATGATCTCA GAGCTCATGT 2400

CCATGTACGA CAGAAATGTC CAGAGCAACC AAACCTCTAT GGTCATGGAA AATCAAAGCG 2460

TGTCACTGCT TCAACCCACA GTCCATAACC ATCAAGAACA TCTCCAGTTC CCAGGAAACA 2520

TGGTGGAAGG AAGTTTCTTT GAAGACTTGA ACATCCCAAA CAGAGCAAAC AACAACAACA 2580

GCAGCAACAA TCAAACGTTT TTTCAAGGGA ACAACAACAA CAACAATGTG TTTAAGTTCG 2640

ACACTGCAGA TCACAACAAC TTTGAAGCTG CACATAACAA CAACAATAAC AGTAGCGGCA 2700

ACAGGTTCCA GCTTGTGTTT GATTCCACAC CGTTCGACAT GGCGTCATTC GATTACAGAG 2760

ATGATATGTC GATGCCAGGA GTAGTAGGAA CGATGGATGG AATGCAGCAG AAGCAGCAAG 2820

ATGTATCCAT ATGGTTCTAA AGTCTTGGTA GTAGATTTCA TCTTCTCTTA TTTTTATCTT 2880

TTGTGTTCTT ACATTCACTC AACCATGTAA TATTTTTTCC TGGGTCTCTC TGTCTCTATC 2940

GCTTGTTATG ATGTGTCTGT AAGAGTCTCT AAAAACTCTC TGTTACTGTG TGTCTTTGTC 3000

TCGGCTTGGT GAATCTCTCT GTCATCATCA GCTTTTAGTT ACACACCCGA CTTGGGGATG 3060

AACGAACACT AAATGTAAGT TTTCATAATA TAAATATATT TGNAAGCTCT CTTCTTCTGT 3120

GTGTTTTGGT TGAGTTTGAC TTTTACAATT GAAAAGTTTG GTGTAATTCA CGCTAACTAC 3180

CTCAAAGTTA GGGAATGGTG GGATAATTAT TTATTACAAT TGTATTTGAT GGATAACGTG 3240

CTTATCGCTA GTGGCTCGCG GGTAGCATTT AAGCATGGGT CAATGCTTGT GTCTACGAGC 3300

TCGAGTGATC GAGCACACAC AATCCAATCC GAACACAAAA CAAGAAGAAA AACAAAATAA 3360

GATCTTAGAT GTAAGGNATT CTTAAAT 3387


(2) INFORMATION FOR SEQ ID NO: 6: 











(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 628 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6: 

Met Met Phe Asn Glu Met Gly Met Cys Gly Asn Met Asp Phe Phe Ser 
1 5 10 15

Ser Gly Ser Leu Gly Glu Val Asp Phe Cys Pro Val Pro Gln Ala Glu
20 25 30

Pro Asp Ser Ile Val Glu Asp Asp Tyr Thr Asp Asp Glu Ile Asp Val
35 40 45

Asp Glu Leu Glu Arg Arg Met Trp Arg Asp Lys Met Arg Leu Lys Arg
50 55 60

Leu Lys Glu Gln Asp Lys Gly Lys Glu Gly Val Asp Ala Ala Lys Gln
65 70 75 80

Arg Gln Ser Gln Glu Gln Ala Arg Arg Lys Lys Met Ser Arg Ala Gln
85 90 95

Asp Gly Ile Leu Lys Tyr Met Leu Lys Met Met Glu Val Cys Lys Ala
100 105 110

Gln Gly Phe Val Tyr Gly Ile Ile Pro Glu Asn Gly Lys Pro Val Thr
115 120 125 

Gly Ala Ser Asp Asn Leu Arg Glu Trp Trp Lys Asp Lys Val Arg Phe 
130 135 140 

Asp Arg Asn Gly Pro Ala Ala Ile Thr Lys Tyr Gln Ala Glu Asn Asn
145 150 155 160

Ile Pro Gly Ile His Glu Gly Asn Asn Pro Ile Gly Pro Thr Pro His
165 170 175

Thr Leu Gln Glu Leu Gln Asp Thr Thr Leu Gly Ser Leu Leu Ser Ala
180 185 190

Leu Met Gln His Cys Asp Pro Pro Gln Arg Arg Phe Pro Leu Glu Lys
195 200 205 

Gly Val Pro Pro Pro Trp Trp Pro Asn Gly Lys Glu Asp Trp Trp Pro 
210 215 220 

Gln Leu Gly Leu Pro Lys Asp Gln Gly Pro Ala Pro Tyr Lys Lys Pro
225 230 235 240

His Asp Leu Lys Lys Ala Trp Lys Val Gly Val Leu Thr Ala Val Ile
245 250 255

Lys His Met Phe Pro Asp Ile Ala Lys Ile Arg Lys Leu Val Arg Gln
260 265 270

Ser Lys Cys Leu Gln Asp Lys Met Thr Ala Lys Glu Ser Ala Thr Trp
275 280 285

Leu Ala Ile Ile Asn Gln Glu Glu Ser Leu Ala Arg Glu Leu Tyr Pro
290 295 300

Glu Ser Cys Pro Pro Leu Ser Leu Ser Gly Gly Ser Cys Ser Leu Leu
305 310 315 320

Met Asn Asp Cys Ser Gln Tyr Asp Val Glu Gly Phe Glu Lys Glu Ser
325 330 335 

His Tyr Glu Val Glu Glu Leu Lys Pro Glu Lys Val Met Asn Ser Ser 
340 345 350 

Asn Phe Gly Met Val Ala Lys Met His Asp Phe Pro Val Lys Glu Glu 
355 360 365 

Val Pro Ala Gly Asn Ser Glu Phe Met Arg Lys Arg Lys Pro Asn Arg 
370 375 380 

Asp Leu Asn Thr Ile Met Asp Arg Thr Val Phe Thr Cys Glu Asn Leu
385 390 395 400

Gly Cys Ala His Ser Glu Ile Ser Arg Gly Phe Leu Asp Arg Asn Ser
405 410 415

Arg Asp Asn His Gln Leu Ala Cys Pro His Arg Asp Ser Arg Leu Pro
420 425 430 

Tyr Gly Ala Ala Pro Ser Arg Phe His Val Asn Glu Val Lys Pro Val 
435 440 445 

Val Gly Phe Pro Gln Pro Arg Pro Val Asn Ser Val Ala Gln Pro Ile
450 455 460

Asp Leu Thr Gly Ile Val Pro Glu Asp Gly Gln Lys Met Ile Ser Glu
465 470 475 480

Leu Met Ser Met Tyr Asp Arg Asn Val Gln Ser Asn Gln Thr Ser Met
485 490 495

Val Met Glu Asn Gln Ser Val Ser Leu Leu Gln Pro Thr Val His Asn
500 505 510

His Gln Glu His Leu Gln Phe Pro Gly Asn Met Val Glu Gly Ser Phe
515 520 525

Phe Glu Asp Leu Asn Ile Pro Asn Arg Ala Asn Asn Asn Asn Ser Ser
530 535 540

Asn Asn Gln Thr Phe Phe Gln Gly Asn Asn Asn Asn Asn Asn Val Phe
545 550 555 560 

Lys Phe Asp Thr Ala Asp His Asn Asn Phe Glu Ala Ala His Asn Asn 
565 570 575 

Asn Asn Asn Ser Ser Gly Asn Arg Phe Gln Leu Val Phe Asp Ser Thr
580 585 590

Pro Phe Asp Met Ala Ser Phe Asp Tyr Arg Asp Asp Met Ser Met Pro
595 600 605

Gly Val Val Gly Thr Met Asp Gly Met Gln Gln Lys Gln Gln Asp Val
610 615 620

Ser Ile Trp Phe
625 

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2234 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7: 



GGCCGCTTCA AACTCTACAA ACCCAGAAAC CACCACACAG TAATTAATGT CTCTTTCTTT 60

CTTCCCATGT GATCTTTAAC AGACTTTTCT TCTTATTCTC CATCTCTGAA GTTGTGGGGA 120

TTCATCAAGA CTTCCTTATC TGTTTCTTTT ATAAAACAAG AGAGAGATAC CACTTTTGGT 180

GTTCTTTATT TGCAACTCTT TCAGGTTAAA GAAATCGATA GGCTCTGTTC TTGATTGTGG 240

TGGAAGAGAC ATGATGATGT TTAACGAGAT GGGAATGTAT GGAAACATGG ATTTCTTCTC 300

TTCCTCCACA TCTCTCGATG TGTGTCCATT ACCACAAGCT GAACAAGAAC CTGTAGTTGA 360

AGATGTCGAC TACACCGATG ATGAGATGGA TGAGCTTGAG CAGAGGATGT GGAGAGACAA 420

AATGCGTTTG AAACGTCTCA AGGAGCAACA GAGTAAGTGT AAAGGAGGCG TCGATGGTTC 480

GAAACAGAGG CAGTCGCAAG AGCAAGCTAG GAGGAAGAAA ATGTCTAGAG CCCAAGATGG 540

GATCTTGAAG TATATGTTGA AGATGATGGA AGTTTGTAAA GCTCAAGGCT TTGTTTATGG 600

TATTATTCCT GAGAAGGGTA AGCCTGTGAC TGGTGCTTCG GATAATTTGA GGGAATGGTG 660

GAAAGATAAG GTTAGGTTTG ATCGTAATGG TCCAGCTGCT ATTGCTAAGT ATCAGTCAGA 720

GAATAATATT TCTGGAGGGA GTAATGATTG TAACAGCTTG GTTGGTCCAA CACCGCATAC 780

GCTTCAGGAG CTTCAGGACA CGACTCTTGG TTCGCTTTTA TCGGCTTTGA TGCAACATTG 840

TGATCCACCG CAGAGACGGT TTCCTTTGGA GAAAGGAGTT TCTCCACCTT GGTGGCCTAA 900

TGGGAATGAA GAGTGGTGGC CTCAGCTTGG TTTACCAAAT GAGCAAGGTC CTCCTCCTTA 960

TAAGAAGCCT CATGATTTGA AGAAAGCTTG GAAAGTCGGT GTTTTAACTG CGGTGATCAA 1020

GCATATGTCG CCGGATATTG CGAAGATCCG TAAGCTTGTG AGGCAATCAA AATGCTTGCA 1080

GGATAAGATG ACGGCGAAAG AGAGTGCTAC TTGGCTTGCC ATTATTAACC AAGAAGAGGT 1140

TGTGGCTCGG GAGCTTTATC CCGAGTCATG CCCTCCTCTT TCTTCTTCTT CATCATTAGG 1200

AAGCGGGTCG CTTCTCATTA ATGATTGTAG CGAGTATGAC GTTGAAGGTT TCGAGAAGGA 1260

ACAACATGGT TTCGATGTGG AAGAGCGGAA ACCAGAGATA GTGATGATGC ATCCTCTAGC 1320

AAGCTTTGGG GTTGCTAAAA TGCAACATTT TCCCATAAAG GAGGAGGTCG CCACCACGGT 1380

AAACTTAGAG TTCACGAGAA AGAGGAAGCA GAACAATGAT ATGAATGTTA TGGTAATGGA 1440

CAGATCAGCA GGTTACACTT GTGAGAATGG TCAGTGTCCT CACAGCAAAA TGAATCTTGG 1500

ATTTCAAGAC AGGAGTTCAA GGGACAACCA CCAGATGGTT TGTCCATATA GAGACAATCG 1560

TTTAGCGTAT GGAGCATCCA AGTTTCATAT GGGTGGAATG AAACTAGTAG TTCCTCAGCA 1620

ACCAGTCCAA CCGATCGACC TATCGGGCGT TGGAGTTCCG GAAAACGGGC AGAAGATGAT 1680

CACCGAGCTT ATGGCCATGT ACGACAGAAA TGTCCAAAGC AACCAAACGC CTCCTACTTT 1740

GATGGAAAAC CAAAGCATGG TCATTGATGC AAAAGCAGCT CAGAATCAGC AGCTGAATTT 1800

CAACAGTGGC AATCAAATGT TTATGCAACA AGGGACGAAC AACGGGGTTA ACAATCGGTT 1860 

CCAGATGGTG TTTGATTCGA CACCATTCGA TATGGCAGCA TTCGATTACA GAGATGATTG 1920 

GCAAACCGGA GCAATGGAAG GAATGGGGAA GCAGCAGCAG CAGCAGCAGC AGCAGCAAAG 1980 

ATGTATCAAT ATGGTTCTGA ATATTACACA ATCTCTGTAA TATTCATTCT TTCATAATAA 2040 

CTCTGTTACC TACTTACCTG ACTTGGGTAT GTATTCTATT GCACCAAACA CTCATCTATA 2100 

TTGTTGATGA TGATGAAGCC ATCTATTTTT TTTTTGTGTC TGAAAGTCAT TTAACTCGCT 2160 

TCATTGTTTT AATAATGTCA CTATCCATTG AACATCATTC TCATGCTACA AGTTTGATTC 2220 

TTTGAGGCGG CCGC 2234 
(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 584 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

Met Met Met Phe Asn Glu Met Gly Met Tyr Gly Asn Met Asp Phe Phe 
1 5 10 15

Ser Ser Ser Thr Ser Leu Asp Val Cys Pro Leu Pro Gln Ala Glu Gln
20 25 30 

Glu Pro Val Val Glu Asp Val Asp Tyr Thr Asp Asp Glu Met Asp Val 
35 40 45 

Asp Glu Leu Glu Lys Arg Met Trp Arg Asp Lys Met Arg Leu Lys Arg 
50 55 60 

Leu Lys Glu Gln Gln Ser Lys Cys Lys Glu Gly Val Asp Gly Ser Lys
65 70 75 80

Gln Arg Gln Ser Gln Glu Gln Ala Arg Arg Lys Lys Met Ser Arg Ala
85 90 95

Gln Asp Gly Ile Leu Lys Tyr Met Leu Lys Met Met Glu Val Cys Lys
100 105 110

Ala Gln Gly Phe Val Tyr Gly Ile Ile Pro Glu Lys Gly Lys Pro Val
115 120 125

Thr Gly Ala Ser Asp Asn Leu Arg Glu Trp Trp Lys Asp Lys Val Arg
130 135 140

Phe Asp Arg Asn Gly Pro Ala Ala Ile Ala Lys Tyr Gln Ser Glu Asn
145 150 155 160

Asn Ile Ser Gly Gly Ser Asn Asp Cys Asn Ser Leu Val Gly Pro Thr
165 170 175

Pro His Thr Leu Gln Glu Leu Gln Asp Thr Thr Leu Gly Ser Leu Leu
180 185 190

Ser Ala Leu Met Gln His Cys Asp Pro Pro Gln Arg Arg Phe Pro Leu
195 200 205 

Glu Lys Gly Val Ser Pro Pro Trp Trp Pro Asn Gly Asn Glu Glu Trp 
210 215 220 

Trp Pro Gln Leu Gly Leu Pro Asn Glu Gln Gly Pro Pro Pro Tyr Lys
225 230 235 240 

Lys Pro His Asp Leu Lys Lys Ala Trp Lys Val Gly Val Leu Thr Ala 
245 250 255 

Val Ile Lys His Met Ser Pro Asp Ile Ala Lys Ile Arg Lys Leu Val
260 265 270

Arg Gln Ser Lys Cys Leu Gln Asp Lys Met Thr Ala Lys Glu Ser Ala
275 280 285

Thr Trp Leu Ala Ile Ile Asn Gln Glu Glu Val Val Ala Arg Glu Leu
290 295 300 

Tyr Pro Glu Ser Cys Pro Pro Leu Ser Ser Ser Ser Ser Leu Gly Ser 
305 310 315 320 

Gly Ser Leu Leu Ile Asn Asp Cys Ser Glu Tyr Asp Val Glu Gly Phe
325 330 335

Glu Lys Glu Gln His Gly Phe Asp Val Glu Glu Arg Lys Pro Glu Ile
340 345 350

Val Met Met His Pro Leu Ala Ser Phe Gly Val Ala Lys Met Gln His
355 360 365

Phe Pro Ile Lys Glu Glu Val Ala Thr Thr Val Asn Leu Glu Phe Thr
370 375 380

Arg Lys Arg Lys Gln Asn Asn Asp Met Asn Val Met Val Met Asp Arg
385 390 395 400

Ser Ala Gly Tyr Thr Cys Glu Asn Gly Gln Cys Pro His Ser Lys Met
405 410 415

Asn Leu Gly Phe Gln Asp Arg Ser Ser Arg Asp Asn His Gln Met Val
420 425 430 

Cys Pro Tyr Arg Asp Asn Arg Leu Ala Tyr Gly Ala Ser Lys Phe His 
435 440 445 

Met Gly Gly Met Lys Leu Val Val Pro Gln Gln Pro Val Gln Pro Ile
450 455 460

Asp Leu Ser Gly Val Gly Val Pro Glu Asn Gly Gln Lys Met Ile Thr
465 470 475 480

Glu Leu Met Ala Met Tyr Asp Arg Asn Val Gln Ser Asn Gln Thr Pro
485 490 495

Pro Thr Leu Met Glu Asn Gln Ser Met Val Ile Asp Ala Lys Ala Ala
500 505 510

Gln Asn Gln Gln Leu Asn Phe Asn Ser Gly Asn Gln Met Phe Met Gln
515 520 525

Gln Gly Thr Asn Asn Gly Val Asn Asn Arg Phe Gln Met Val Phe Asp
530 535 540

Ser Thr Pro Phe Asp Met Ala Ala Phe Asp Tyr Arg Asp Asp Trp Gln
545 550 555 560

Thr Gly Ala Met Glu Gly Met Gly Lys Gln Gln Gln Gln Gln Gln Gln
565 570 575

Gln Gln Asp Val Ser Ile Trp Phe
580

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1722 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

CAGATTCTAT GGATATGTAT AACAACAATA TAGGGATGTT CCGGAGTTTA GTTTGTAGCT 60 

CGGCGCCTCC ATTTACAGAG GGACATATGT GTTCTGATTC GCATACGGCT TTGTGCGATG 120 

ATCTGAGTAG TGATGAGGAA ATGGAAATAG A3GAGCTTGA GAAGAAGATC TGGAGAGACA 180 

AGCAGCGTTT AAAGCGGCTC AAGGAAATGG CGAAGAACGG TCTAGGAACA AGATTGTTGT 240 

TGAAGCAGCA ACATGATGAT TTTCCAGAGC ACTCTAGTAA GAGAACCATG TACAAGGCAC 300 

AAGATGGGAT CTTGAAGTAC ATGTCGAAGA CAATGGAGCG ATATAAAGCT CAAGGTTTTG 360 

TTTATGGGAT TGTGTTAGAG AATGGGAAAA CGGTAGCGGG ATCTTCTGAT AATCTCCGTG 420 

AATGGTGGAA AGACAAAGTG AGGTTTGATA GGAACGGCCC AGCTGCTATA ATCAAGCACC 480 

AAAGGGATAT CAATCTTTCT GATGGAAGTG ATTCAGGGTC TGAGGTTGGG GATTCTACCG 540 

CACAGAAGTT GCTTGAGCTT CAAGATACTA CTCTTGGAGC TCTGTTATCG GCTCTGTTTC 600 

CTCACTGCAA CCCTCCTCAG AGGCGGTTTC CGTTGGAGAA AGGCGTGACA CCGCCATGGT 660 

GGCCAACGGG GAAAGAAGAT TGGTGGGATC AACTGTCTTT ACCCGTTGAT TTTCGAGGTG 720 

TTCCGCCACC TTACAAGAAG CCTCATGATC TCAAGAAGCT GTGGAAAATT GGTGTTTTGA 780 

TTGGTGTAAT CAGACATATG GCTTCTGACA TTAGCAACAT ACCCAATCTC GTGAGACGGT 840 

CTAGAAGTTT GCAGGAGAAA ATGACGTCAA GAGAAGGCGC TTTATGGCTC GCTGCTCTTT 900 

ACCGAGAAAA GGCTATTGTT GATCAAATAG CCATGTCTAG AGAAAACAAC AACACTTCTA 960

ACTTTCTTGT TCCTGCAACC GGTGGAGACC CAGATGTTTT GTTTCCTGAA TCTACAGACT 1020

ATGATGTTGA ACTGATTGGT GGCACTCATC GGACCAATCA GCAGTATCCT GAATTTGAAA 1080

ACAACTACAA CTGTGTTTAC AAGAGAAAGT TTGAAGAAGA TTTTGGGATG CCAATGCATC 1140

CAACACTCCT AACATGTGAG AACAGTCTCT GTCCTTATAG CCAACCACAT ATGGGATTTC 1200 

TTGACAGGAA CTTAAGAGAG AATCACCAAA TGACTTGTCC TTATAAAGTC ACTTCCTTCT 1260 

ACCAACCAAC TAAACCCTAT GGTATGACGG GTTTAATGGT TCCTTGTCCG GATTATAACG 1320 

GGATGCAGCA GCAGGTTCAG AGCTTTCAAG ACCAGTTTAA TCATCCCAAC GATCTCTACA 1380 

GACCAAAAGC TCCACAAAGA GGCAACGATG ACTTGGTTGA GGATTTGAAT CCTTCTCCTT 1440 

CGACGCTGAA TCAGAATCTT GGTTTAGTCT TACCTACTGA CTTCAATGGA GGTGAGGAAA 1500 

CAGTAGGAAC AGAGAACAAT CTGCATAATC AAGGGCAAGA GTTGCCCACA TCTTGGATTC 1560 

AGTAAAGAAA GCTTCAGAGT TTTCTTTTTA TGTTTTCTAG TCTTTATAGC TTTGTCTCTT 1620 

GCTTATTCTC TCATTAAACA CAGTTTTTGA TCTCTCCATT TCATAGCCCA TGTAGCAATG 1680 

GAGAAGATTA GGTTTCATAA TAAGTTAATA ACCAAATTCA AA 1722 
(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 520 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Asp Ser Met Asp Met Tyr Asn Asn Asn Ile Gly Met Phe Arg Ser Leu
1 5 10 15

Val Cys Ser Ser Ala Pro Pro Phe Thr Glu Gly His Met Cys Ser Asp 
20 25 30 

Ser His Thr Ala Leu Cys Asp Asp Leu Ser Ser Asp Glu Glu Met Glu 
35 40 45 

Ile Glu Glu Leu Glu Lys Lys Ile Trp Arg Asp Lys Gln Arg Leu Lys
50 55 60

Arg Leu Lys Glu Met Ala Lys Asn Gly Leu Gly Thr Arg Leu Leu Leu
65 70 75 80

Lys Gln Gln His Asp Asp Phe Pro Glu His Ser Ser Lys Arg Thr Met
85 90 95

Tyr Lys Ala Gln Asp Gly Ile Leu Lys Tyr Met Ser Lys Thr Met Glu
100 105 110

Arg Tyr Lys Ala Gln Gly Phe Val Tyr Gly Ile Val Leu Glu Asn Gly
115 120 125

Lys Thr Val Ala Gly Ser Ser Asp Asn Leu Arg Glu Trp Trp Lys Asp
130 135 140

Lys Val Arg Phe Asp Arg Asn Gly Pro Ala Ala Ile Ile Lys His Gln
145 150 155 160

Arg Asp Ile Asn Leu Ser Asp Gly Ser Asp Ser Gly Ser Glu Val Gly
165 170 175

Asp Ser Thr Ala Gln Lys Leu Leu Glu Leu Gln Asp Thr Thr Leu Gly
180 185 190

Ala Leu Leu Ser Ala Leu Phe Pro His Cys Asn Pro Pro Gln Arg Arg
195 200 205 

Phe Pro Leu Glu Lys Gly Val Thr Pro Pro Trp Trp Pro Thr Gly Lys 
210 215 220 

Glu Asp Trp Trp Asp Gln Leu Ser Leu Pro Val Asp Phe Arg Gly Val
225 230 235 240

Pro Pro Pro Tyr Lys Lys Pro His Asp Leu Lys Lys Leu Trp Lys Ile
245 250 255

Gly Val Leu Ile Gly Val Ile Arg His Met Ala Ser Asp Ile Ser Asn
260 265 270

Ile Pro Asn Leu Val Arg Arg Ser Arg Ser Leu Gln Glu Lys Met Thr
275 280 285 

Ser Arg Glu Gly Ala Leu Trp Leu Ala Ala Leu Tyr Arg Glu Lys Ala 
290 295 300 

Ile Val Asp Gln Ile Ala Met Ser Arg Glu Asn Asn Asn Thr Ser Asn
305 310 315 320 

Phe Leu Val Pro Ala Thr Gly Gly Asp Pro Asp Val Leu Phe Pro Glu 
325 330 335 

Ser Thr Asp Tyr Asp Val Glu Leu Ile Gly Gly Thr His Arg Thr Asn
340 345 350

Gln Gln Tyr Pro Glu Phe Glu Asn Asn Tyr Asn Cys Val Tyr Lys Arg
355 360 365 

Lys Phe Glu Glu Asp Phe Gly Met Pro Met His Pro Thr Leu Leu Thr 
370 375 380 

Cys Glu Asn Ser Leu Cys Pro Tyr Ser Gln Pro His Met Gly Phe Leu
385 390 395 400

Asp Arg Asn Leu Arg Glu Asn His Gln Met Thr Cys Pro Tyr Lys Val
405 410 415

Thr Ser Phe Tyr Gln Pro Thr Lys Pro Tyr Gly Met Thr Gly Leu Met
420 425 430

Val Pro Cys Pro Asp Tyr Asn Gly Met Gln Gln Gln Val Gln Ser Phe
435 440 445

Gln Asp Gln Phe Asn His Pro Asn Asp Leu Tyr Arg Pro Lys Ala Pro
450 455 460

Gln Arg Gly Asn Asp Asp Leu Val Glu Asp Leu Asn Pro Ser Pro Ser
465 470 475 480

Thr Leu Asn Gln Asn Leu Gly Leu Val Leu Pro Thr Asp Phe Asn Gly
485 490 495

Gly Glu Glu Thr Val Gly Thr Glu Asn Asn Leu His Asn Gln Gly Gln
500 505 510

Glu Leu Pro Thr Ser Trp Ile Gln
515 520 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2065 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 



TTCCCCTGAG AACGACAGGA GAAAGAATAA AAACCCTAAA TTTCTTTAAT TTCGGCGCTT 60

CAGATTATCG TTGTTAAAGG TTTTTGATTG ATTTTGTTTA AATGGGCGAT CTTGCTATGT 120

CCGTAGCAGA CATCAGGATG GAGAATGAGC CTGATGATTT AGCTAGTGAT AATGTTGCTG 180

AGATTGATGT GAGTGATGAA GAGATTGATG CTGACGACCT TGAGAGACGG ATGTGGAAAG 240

ATCGTGTCAG GCTTAAAAGA ATCAAAGAGC GACAAAAAGC TGGCTCTCAA GGAGCTCAAA 300

ACGAAGGGAG ACACCTAAGA AAATCTCTGA TCAAGCTCAG AGGAAGAAAA TGTCTTAGAG 360

CTCAAGATGG TATCCTTAAG TACATTGTTG AAGCTTATGG AAGTCTGCAA AGTTCGCGGG 420

TTTGTCTATG GTATAATACC GGAAAAGGGC AAGCCTGTGA GTTGGCTCCT CTGACAATAT 480

AAGAGCTTGG TGGAAAGAGA AAGTGAAGTT TGATAAGAAC GGTCCTGCTG CTATTGCTAA 540

ATACGAAGAG GAGTGTTTAG CGTTTGGGAA ATCTGATGGG AATAGGAATT CACAGTTTGT 600

TCTCCAGGAT TTGCAAGATG CTACTTTAGG GTCTTTGTTA TCTTCTTTGA TGCAACATTG 660

TGATCCTCCT CAAAGGAAGT ATCCGTTGGA GAAAGGGACG CCTCCGCCTT GGTGGCCAAC 720

GGGGAATGAA GAATGGTGGG TGAAACTCGG TCTGCCTAAA AGCCAGAGTC CTCCTTACCG 780

AAAACCTCAT GATCTCAAGA AGATGTGGAA GGTTGGAGTT TTAACGGCAG TGATCAATCA 840

TATGTTACCT GATATTGCAA AGATTAAGAG GCATGTTCGT CAGTCGAAAT GTTTACAGGA 900

CAAGATGACA GCTAAAGAGA GTGCGATTTG GTTGGCGGTT TTGAACCAAG AGGAATCTTT 960

GATTCAGCAG CCTAGCAGTG ACAATGGAAA CTCCAATGTG ACTGAGACAC ATCGTAGGGG 1020

TAATAACGCT GACAGGAGGA AACCTGTGGT CAACAGTGAC AGTGACTATG ATGTTGATGG 1080

GACAGAGGAA GCTTCAGGTT CAGTTTCATC TAAAGACAGT AGAAGAAATC AGATTCAAAA 1140

AGAACAACCA ACAGCCATCT CACATTCAGT AAGAGATCAA GATAAAGCAG AGAAACATCG 1200

CAGAAGGAAA AGACCTCGAA TTAGATCCGG AACTGTCAAT CGACAAGAGG AAGAACAACC 1260

TGAAGCTCAA CAAAGAAACA TCTTACCTGA TATGAATCAT GTTGATGCCC CTCTGCTAGA 1320

ATATAACATC AACGGTACTC ATCAAGAGGA CGATGTTGTC GACCCAAATA TTGCCTTAGG 1380

ACCAGAGGAT AATGGTCTGG AACTAGTGGT TCCTGAGTTC AATAACCAAA CATACTTATC 1440

TTCCACTTGT TAATGAACAA ACTATGATGC CTGTAGACGA AAGGCCAATG CTTTATGGAC 1500

CCAAACCCTA ACCAAGAGCT TCAATTTGGG TCAGGGTACA ACTTCTACAA TCCCTCTGCA 1560

GTGTTTGTAC ATAACCAGGA AGACGACATT CTCCATACAC AGATAGAAAT GAATACACAA 1620

GCACCACCTC ACAACAGTGG GTTCGAGGAG GCCCCAGGAG GAGTACTTCA ACCCCTTGGT 1680

TTACTCGGAA ATGAAGACGG TGTAACAGGG AGTGAGTTGC CTCAGTATCA GAGTGGCATT 1740

CTGTCTCCAT TGACTGACTT GGACTTTGAC TATGGTGGTT TTGGTGATGA TTTCTCATGG 1800

TTTGGAGCTT AGTGTCTTGC CATTTTTTTT GGGAGATTAC ATAGTTCAAA AGGACATGGC 1860

AATAGTCTGG CTAGTACAGT TACTTTCTCT TCTTCATTTC TTCTGATCTT ATATTCTTCC 1920

TCTTTTTTTC TTATAATATT TTCTTAGATT TGTTAAGAGA AACAATTTTC CTTTTGAATA 1980

AGTTGCCAGA AGAACTGCTT TGCCCGTTGT AATGGTCTCT AGGGAAAGCA GTTAGCGTAT 2040

CATCATTTGT AAATTTACCT GTGAG 2065



(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 567 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Met Gly Asp Leu Ala Met Ser Val Ala Asp Ile Arg Met Glu Asn Glu
1 5 10 15

Pro Asp Asp Leu Ala Ser Asp Asn Val Ala Glu Ile Asp Val Ser Asp
20 25 30

Glu Glu Ile Asp Ala Asp Asp Leu Glu Arg Arg Met Trp Lys Asp Arg
35 40 45

Val Arg Leu Lys Arg Ile Lys Glu Arg Gln Lys Ala Gly Ser Gln Gly
50 55 60

Ala Gln Thr Lys Glu Thr Pro Lys Lys Ile Ser Asp Gln Ala Gln Arg
65 70 75 80

Lys Lys Met Ser Arg Ala Gln Asp Gly Ile Leu Lys Tyr Met Leu Lys
85 90 95

Leu Met Glu Val Cys Lys Val Arg Gly Phe Val Tyr Gly Ile Ile Pro
100 105 110

Glu Lys Gly Lys Pro Val Ser Gly Ser Ser Asp Asn Ile Arg Ala Trp
115 120 125

Trp Lys Glu Lys Val Lys Phe Asp Lys Asn Gly Pro Ala Ala Ile Ala
130 135 140 

Lys Tyr Glu Glu Glu Cys Leu Ala Phe Gly Lys Ser Asp Gly Asn Arg 
145 150 155 160 

Asn Ser Gln Phe Val Leu Gln Asp Leu Gln Asp Ala Thr Leu Gly Ser
165 170 175

Leu Leu Ser Ser Leu Met Gln His Cys Asp Pro Pro Gln Arg Lys Tyr
180 185 190 

Pro Leu Glu Lys Gly Thr Pro Pro Pro Trp Trp Pro Thr Gly Asn Glu 
195 200 205 

Glu Trp Trp Val Lys Leu Gly Leu Pro Lys Ser Gln Ser Pro Pro Tyr
210 215 220 

Arg Lys Pro His Asp Leu Lys Lys Met Trp Lys Val Gly Val Leu Thr 
225 230 235 240 

Ala Val He Asn His Met Leu Pro Asp He Ala Lys He Lys Arg His 
245 250 255 

Val Arg Gin Ser Lys Cys Leu Gin Asp Lys Met Thr Ala Lys Glu Ser 
260 265 270 

Ala He Trp Leu Ala Val Leu Asn Gin Glu Glu Ser Leu He Gin Gin 
275 280 285 

Pro Ser Ser Asp Asn Gly Asn Ser Asn Val Thr Glu Thr His Arg Arg 
290 295 300 

Gly Asn Asn Ala Asp Arg Arg Lys Pro Val Val Asn Ser Asp Ser Asp 
305 310 315 320 

Tyr Asp Val Asp Gly Thr Glu Glu Ala Ser Gly Ser Val Ser Ser Lys 
325 330 335 

Asp Ser Arg Arg Asn Gin He Gin Lys Glu Gin Pro Thr Ala He Ser 
340 345 350 

His Ser Val Arg Asp Gin Asp Lys Ala Glu Lys His Arg Arg Arg Lys 
355 360 365 

Arg Pro Arg Ile Arg Ser Gly Thr Val Asn Arg Gln Glu Glu Glu Gln
370 375 380 

Pro Glu Ala Gln Gln Arg Asn Ile Leu Pro Asp Met Asn His Val Asp
385 390 395 400

Ala Pro Leu Leu Glu Tyr Asn Ile Asn Gly Thr His Gln Glu Asp Asp
405 410 415

Val Val Asp Pro Asn Ile Ala Leu Gly Pro Glu Asp Asn Gly Leu Glu
420 425 430

Leu Val Val Pro Glu Phe Asn Asn Asn Tyr Thr Tyr Leu Pro Leu Val
435 440 445

Asn Glu Gln Thr Met Met Pro Val Asp Glu Arg Pro Met Leu Tyr Gly
450 455 460

Pro Asn Pro Asn Gln Glu Leu Gln Phe Gly Ser Gly Tyr Asn Phe Tyr
465 470 475 480

Asn Pro Ser Ala Val Phe Val His Asn Gln Glu Asp Asp Ile Leu His
485 490 495

Thr Gln Ile Glu Met Asn Thr Gln Ala Pro Pro His Asn Ser Gly Phe
500 505 510

Glu Glu Ala Pro Gly Gly Val Leu Gln Pro Leu Gly Leu Leu Gly Asn
515 520 525

Glu Asp Gly Val Thr Gly Ser Glu Leu Pro Gln Tyr Gln Ser Gly Ile
530 535 540

Leu Ser Pro Leu Thr Asp Leu Asp Phe Asp Tyr Gly Gly Phe Gly Asp
545 550 555 560

Asp Phe Ser Trp Phe Gly Ala
565



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 240 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

Met Thr Val Val Arg Glu Tyr Asp Pro Thr Arg Asp Leu Val Gly Val
1 5 10 15

Glu Asp Val Glu Arg Arg Cys Glu Val Gly Pro Ser Gly Lys Leu Ser
20 25 30

Leu Phe Thr Asp Leu Leu Gly Asp Pro Ile Cys Arg Ile Arg His Ser
35 40 45

Pro Ser Tyr Leu Met Leu Val Ala Glu Met Gly Thr Glu Xaa Xaa Xaa
50 55 60

Lys Lys Glu Ile Val Gly Met Ile Arg Gly Cys Ile Lys Thr Val Thr
65 70 75 80

Cys Gly Gln Lys Leu Asp Leu Asn His Lys Xaa Xaa Xaa Ser Gln Asn
85 90 95

Asp Val Val Xaa Xaa Lys Pro Leu Tyr Thr Lys Leu Xaa Xaa Xaa Xaa
100 105 110

Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Ala Tyr Val Leu Gly Leu Arg Val
115 120 125

Ser Pro Phe His Arg Arg Gln Gly Ile Gly Phe Lys Leu Val Lys Met
130 135 140

Met Glu Glu Trp Phe Arg Gln Xaa Asn Gly Ala Glu Tyr Ser Tyr Ile
145 150 155 160

Ala Thr Glu Asn Asp Xaa Xaa Xaa Xaa Asn Gln Ala Ser Val Asn Leu
165 170 175

Phe Thr Gly Lys Cys Gly Tyr Ser Glu Phe Arg Thr Pro Ser Ile Leu
180 185 190

Val Asn Pro Val Tyr Ala His Arg Val Asn Val Ser Arg Arg Val Thr
195 200 205

Val Ile Lys Leu Glu Pro Val Asp Ala Glu Thr Xaa Xaa Xaa Leu Tyr
210 215 220

Arg Ile Arg Phe Ser Thr Thr Glu Phe Phe Xaa Xaa Xaa Xaa Xaa Xaa
225 230 235 240



(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1702 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

CTCCAACTTT TAAAACTCAT CATAAATAGT AAAAAAGTAG CCGGAAAAAT AAAATAAAAA 60 

GTCTATTTCT CTTTCCTTTA AAATCCAAAT CCTATAAACT CATAGCTTTC TCTGTTCTTT 120 

ACTTATACCT CACGTTATAC ATATATATAG AGTTTCTATA AATGCTTCTC TTTCCTCTCG 180 

AACAAATCTT CCTCACTTCT CTCATTTCCA CACTCACCTT CCTCTCTATA TATTAAACCC 240 

TATCTACTTA ACTCTTCTTC TAACTCTAAT CTCTCTCTCT ATTTACTCTG CTTCTGTTCT 300 

CACTCTGAAA GAACCAAAAC ATGACGGTGG TTAGAGAGTA CGACCCGACC CGAGACTTAG 360 

TCGGCGTGGA GGACGTGGAA CGACGGTGTG AAGTCGGACC AAGCGGCAAG CTTTCTCTTT 420 

TCACCGACCT TTTGGGTGAC CCGATTTGTA GAATCCGACA TTCACCTTCC TATCTCATGC 480 

TGGTGGCTGA GATGGGTACG GAGAAGAAGG AGATAGTGGG CATGATTAGA GGATGTATCA 540 

AAACCGTTAC ATGTGGCCAA AAACTCGATT TAAATCACAA ATCTCAAAAC GATGTCGTTA 600 

AGCCTCTTTA CACTAAACTC GCTTACGTCT TGGGCCTTCG CGTCTCTCCT TTTCACAGGA 660 

GACAAGGGAT TGGGTTTAAG CTCGTGAAGA TGATGGAGGA ATGGTTTAGA CAAAACGGAG 720 

CTGAGTATTC GTATATTGCA ACTGAGAACG ATAATCAAGC TTCTGTGAAT TTGTTCACCG 780 

GGAAATGTGG TTATTCGGAG TTTCGTACAC CGTCGATTTT GGTTAACCCG GTTTACGCTC 840 

ATCGAGTTAA TGTTTCGCGG CGAGTCACGG TTATCAAGTT AGAGCCGGTT GATGCTGAGA 900

CGTTGTACCG AATCCGGTTT AGCACAACAG AGTTTTTCCC GCGGGATATT GATTCGGTAC 960

TTAATAACAA ACTCTCGCTT GGGACTTTCG TCGCGGTGCC ACGTGGAAGC TGTTATGGAT 1020

CCGGGTCTGG ATCATGGCCC GGTTCGGCTA AATTCCTCGA ATATCCACCC GAGTCATGGG 1080

CCGTATTAAG CGTGTGGAAT TGTAAAGACT CGTTTCTGTT AGAAGTACGT GGAGCGTCGA 1140 

GATTGAGACG TGTGGTGGCT AAAACGACGC GAGTAGTTGA TAAAACGTTG CCGTTTCTGA 1200 

AACTACCTTC GATACCGTCC GTTTTCGAAC CTTTTGGACT TCATTTTATG TATGGAATCG 1260 

GAGGAGAAGG TCCACGCGCG GTGAAGATGG TGAAATCCTT GTGTGCTCAC GCGCATAACT 1320 

TGGCTAAGGC AGGTGGTTGT GGTGTCGTGG CGGCGGAAGT TGCCGGAGAA GACCCGTTGC 1380 

GGCGAGGAAT ACCACATTGG AAAGTGCTAT CGTGTGACGA GGATCTTTGG TGTATAAAGC 1440 

GGCTTGGAGA TGACTATAGT GATGGTGTTG TTGGTGATTG GACTAAATCG CCACCTGGCG 1500 

TTTCCATTTT TGTAGACCCT AGAGAATTTT AAAACTTTTT TTTTAACTCT ATAATATATA 1560 

TTCTCTATTA ACCACTTGAT GTTAAATTAG GGGTTTTCTT CTAAGTTTAT AGATTTTCTT 1620 

GTTTTAGAAT TAATCTTTTT TTTAGGTAAC TTTTTTTGCT TTTTGTTTTG TTTTGTTTTG 1680 

TTTTTGTGGG TGTTATAAAT TA 1702 
(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4146 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

TGTCATAATC AGTACAAAAT AAATCACCTA CCAACCTGAA CTATATGTTA TATATTTTGA 60 

GGGGCCACGT CAAGTGTGCC GTTTATTTTT GTGTTTATGA TTGTTTAATA TTTGTGCGTG 120 

TGATGGTGTT TCTTGCTTAG TTTCCACTTA ATACACAATC AAATATCAAG TGGAACTATT 180 

TATGAAAATT GTTCTTCGAG AAGAATTCTG ACCCTAAAAG GTCATTTGAG GGCTTGAGGC 240 

TTATTGTTTC CAAATTACAC CAGTAAACAA GGGTTTTTTT TTGTCAACAA AGATTATTGT 300 

AATTCGAATT TCGTCTACAA TAAAACAATT TTCTTACTAA AACAAAACAA TTAGCTGACG 360 

GTTGATATTT CGGCTTTTGA GTTTAATTAA CTAATTGGTG ATTATGTTGA TGATCTTTCA 420 

CACCTAATGA AGTGTCATGT ATATGTATAT ATGTATATAC TTATGTATAT ATAAAACGTA 480 

CATATAATCA TTTGTCATAT ATATCATCAT GTATTGCATG ACTAAACTAC CCTTAAAAGA 540 

GGAATACGAT AGACATGACC TTTAGGAATT TGTTTTTTTC TTCTAAATGG ATTCCTTCGC 600 

TTCTTTTTAG CCTCGTAGTG AATTTGAACA TTGCAGTTAT TTCTAGTAAG ATATTTTTTC 660

TGTATTTTTC GGAAAATGTT AAAAACTAAT TATACACAAT TTACTTTCTC TCTCAACTCT 720

TATTTTACGT TACTGTTTTT TTTTTCCTCT TGCAAAATTA GAGCTGATGT ATTTACATTT 780

ACTAGTAATT TGGTAGATAG ACAGTTAATG TAGTATATAG ATGGGGTTGA GGGCAAATGA 840

TTACTTGGGA GATGGTGCAA TGCATCAGAG TGATGATGTG GAATTTAATA AGTGTGAATT 900

TATGGGCAAA GGAAGGGAAC TAGTAGTAGA AAGGGAAATA AATACAGTAC AAGTAAGAGG 960

AAAACGAAAA GAGAGATAGA AACCATAATA ATGAGTTAAC GCAGACATAG CCGCCATTTT 1020

CAACTTCTCA CTCCCACTTA CAACTTCTCC TTCTGGGCAA GTTTTCCACA TCAATGCTCG 1080


TCTTAATCAC CATTAATCTC TACTCATCAT TAATACGTTG AAGCCCACTA TTTCAAAATT 1140

TACTAGGAGT ATTTATTCGT GAAAAACATT TAAATGTCCC TAATTATAAG AGATTTAATT 1200

TCATATTTAT TGTATTAAAG AGAATTTACA TTAGCTGTCA AAAAAAAAAA AAAAAGAGAA 1260

TTAACATTAT TTTACAGAAC ATAAAATTTT GAAAATAGAT AGCGCCACTG CATGTAAGAA 1320

CATACAAATT TCTTTTTTTC AACAAAATCT ATTTATATTT CTTCTTTTTT TGAACATTAT 1380

GTGTAGTTTG TAGTAAACTA AAAAGTGTGG ACCAACACAA TTTAAATCAT TCGATTTTGT 1440

AGCAAAAACA TTTTTGTTCC AATTTCCAAG CAGCAAATAT GGAAGGAATA TAAATTCTTT 1500

ACTATTTTTC CTCTTAACAC ATAAAAGTAA AAAAAGCATT CAATGATCAG TTAAAATCTG 1560

GTTAGAATTC TACCTTATCA TTTAGAACTA GCTAATATTT AAATTCATAT ATACAAAAAA 1620

TAAAATGGGA ACTGTAGAGA CTAGAGACTA TAAATAGAGG ATTGAGAAGA AGAACTTTTA 1680

AAGCTCTATC AATCATGAAC TACTCGCCTT CTCCAACTTT TAAAACTCAT CATAAATAGT 1740

AAAAAAGTAG CCGGAAAAAT AAAATAAAAA GTCTATTTCT CTTTCCTTTA AAATCCAAAT 1800

CCTATAAACT CATAGCTTTC TCTGTTCTTT ACTTATACCT CACGTTATAC ATATATATAG 1860

AGTTTCTATA AATGCTTCTC TTTCCTCTCG AACAAATCTT CCTCACTTCT CTCATTTCCA 1920

CACTCACCTT CCTCTCTATA TATTAAACCC TATCTACTTA ACTCTTCTTC TAACTCTAAT 1980

CTCTCTCTCT ATTTACTCTG CTTCTGTTCT CACTCTGAAA GAACCAAAAC ATGACGGTGG 2040

TTAGAGAGTA CGACCCGACC CGAGACTTAG TCGGCGTGGA GGACGTGGAA CGACGGTGTG 2100

AAGTCGGACC AAGCGGCAAG CTTTCTCTTT TCACCGACCT TTTGGGTGAC CCGATTTGTA 2160

GAATCCGACA TTCACCTTCC TATCTCATGC TGGTAATAAC ATGTTTCACA ATCTTTTATC 2220

TTCTTTTACT TGTATGTCTC TTCAAAAACT CTGTTTGTTT TTTGAACCTA GAAGTAGAAA 2280

ACATAGAACA CCAACTTCTC AACCTTTGGT TAATCCAAAA AACCCATTTT CCATAAACAA 2340

TTAAAGTTCG GTTCTTTTTT TGGTATCATT TCTATTTTTT TCCGATTCTT GATAAGATCA 2400

AAAGACTCAT CATTTATATT ATTTTTTGCA ACCAAATGAT ACCCGAGTAA CTATAACTAA 2460

TAAAGTTTCC TCTTTATTAT AAAAGGTTAA AAACATATAA TAACGGAAAA TTTAAATTAT 2520

GGGACTGTAA CAGGTGGCTG AGATGGGTAC GGAGAAGAAG GAGATAGTGG GCATGATTAG 2580

AGGATGTATC AAAACCGTTA CATGTGGCCA AAAACTCGAT TTAAATCACA AATCTCAAAA 2640

CGATGTCGTT AAGCCTCTTT ACACTAAACT CGCTTACGTC TTGGGCCTTC GCGTCTCTCC 2700

TTTTCACAGG TACCCTTCCG TTTTCCTCCC ACTCATAATC ACACGCTATT ATAGATTTTG 2760


GTTATCTAAA CTAGTTTTGG TTTTTGCAGG AGACAAGGGA TTGGGTTTAA GCTCGTGAAG 2820

ATGATGGAGG AATGGTTTAG ACAAAACGGA GCTGAGTATT CGTATATTGC AACTGAGAAC 2880

GATAATCAAG CTTCTGTGAA TTTGTTCACC GGGAAATGTG GTTATTCGGA GTTTCGTACA 2940

CCGTCGATTT TGGTTAACCC GGTTTACGCT CATCGAGTTA ATGTTTCGCG GCGAGTCACG 3000

GTTATCAAGT TAGAGCCGGT TGATGCTGAG ACGTTGTACC GAATCCGGTT TAGCACAACA 3060

GAGTTTTTCC CGCGGGATAT TGATTCGGTA CTTAATAACA AACTCTCGCT TGGGACTTTC 3120

GTCGCGGTGC CACGTGGAAG CTGTTATGGA TCCGGGTCTG GATCATGGCC CGGTTCGGCT 3180

AAATTCCTCG AATATCCACC CGAGTCATGG GCCGTATTAA GCGTGTGGAA TTGTAAAGAC 3240

TCGTTTCTGT TAGAAGTACG TGGAGCGTCG AGATTGAGAC GTGTGGTGGC TAAAACGACG 3300

CGAGTAGTTG ATAAAACGTT GCCGTTTCTG AAACTACCTT CGATACCGTC CGTTTTCGAA 3360

CCTTTTGGAC TTCATTTTAT GTATGGAATC GGAGGAGAAG GTCCACGCGC GGTGAAGATG 3420

GTGAAATCCT TGTGTGCTCA CGCGCATAAC TTGGCTAAGG CAGGTGGTTG TGGTGTCGTG 3480

GCGGCGGAAG TTGCCGGAGA AGACCCGTTG CGGCGAGGAA TACCACATTG GAAAGTGCTA 3540

TCGTGTGACG AGGATCTTTG GTGTATAAAG CGGCTTGGAG ATGACTATAG TGATGGTGTT 3600

GTTGGTGATT GGACTAAATC GCCACCTGGC GTTTCCATTT TTGTAGACCC TAGAGAATTT 3660

TAAAACTTTT TTTTTAACTC TATAATATAT ATTCTCTATT AACCACTTGA TGTTAAATTA 3720

GGGGTTTTCT TCTAAGTTTA TAGATTTTCT TGTTTTAGAA TTAATCTTTT TTTTAGGTAA 3780

CTTTTTTTGC TTTTTGTTTT GTTTTGTTTT GTTTTTGTGG GTGTTATAAA TTAGTGGTAA 3840

GAGGTAATAT CTCCTACTTT TGGGTTTGTG TCTTCTTGTC TTGTAAATGG ATCTAGCTTT 3900

TTAAGATACT TTTTCTTTGT GGCCAAACCA AAACGCCGAC CTGATTATTA TTTCCAAGTA 3960

GATAAAATTT CATGAACGCA CTGATACGTA TAATGATGCA ATTTGTGTTA AGACGATACT 4020

TTGGAGATAA AATTACAATA TGACAATGAT AGAAAATGTT ACCAATAACG ATTAGCATTA 4080

TCGTGTGTGC CATCAAGTAT AACTAAGAGA AAGACGCACA TTTTCTTTAA GAGTAAATAA 4140

AATATT 4146



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 398 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

Met Thr Val Val Arg Glu Tyr Asp Pro Thr Arg Asp Leu Val Gly Val
1 5 10 15

Glu Asp Val Glu Arg Arg Cys Glu Val Gly Pro Ser Gly Lys Leu Ser
20 25 30

Leu Phe Thr Asp Leu Leu Gly Asp Pro Ile Cys Arg Ile Arg His Ser
35 40 45

Pro Ser Tyr Leu Met Leu Val Ala Glu Met Gly Thr Glu Lys Lys Glu
50 55 60

Ile Val Gly Met Ile Arg Gly Cys Ile Lys Thr Val Thr Cys Gly Gln
65 70 75 80

Lys Leu Asp Leu Asn His Lys Ser Gln Asn Asp Val Val Lys Pro Leu
85 90 95

Tyr Thr Lys Leu Ala Tyr Val Leu Gly Leu Arg Val Ser Pro Phe His
100 105 110

Arg Arg Gln Gly Ile Gly Phe Lys Leu Val Lys Met Met Glu Glu Trp
115 120 125

Phe Arg Gln Asn Gly Ala Glu Tyr Ser Tyr Ile Ala Thr Glu Asn Asp
130 135 140

Asn Gln Ala Ser Val Asn Leu Phe Thr Gly Lys Cys Gly Tyr Ser Glu
145 150 155 160

Phe Arg Thr Pro Ser Ile Leu Val Asn Pro Val Tyr Ala His Arg Val
165 170 175

Asn Val Ser Arg Arg Val Thr Val Ile Lys Leu Glu Pro Val Asp Ala
180 185 190

Glu Thr Leu Tyr Arg Ile Arg Phe Ser Thr Thr Glu Phe Phe Pro Arg
195 200 205

Asp Ile Asp Ser Val Leu Asn Asn Lys Leu Ser Leu Gly Thr Phe Val
210 215 220

Ala Val Pro Arg Gly Ser Cys Tyr Gly Ser Gly Ser Gly Ser Trp Pro
225 230 235 240

Gly Ser Ala Lys Phe Leu Glu Tyr Pro Pro Glu Ser Trp Ala Val Leu
245 250 255

Ser Val Trp Asn Cys Lys Asp Ser Phe Leu Leu Glu Val Arg Gly Ala
260 265 270

Ser Arg Leu Arg Arg Val Val Ala Lys Thr Arg Arg Val Val Asp Lys
275 280 285

Thr Leu Pro Phe Leu Lys Leu Pro Ser Ile Pro Ser Val Phe Glu Pro
290 295 300

Phe Gly Leu His Phe Met Tyr Gly Ile Gly Gly Glu Gly Pro Arg Ala
305 310 315 320

Val Lys Met Val Lys Ser Leu Cys Ala His Ala His Asn Leu Ala Lys
325 330 335

Ala Gly Gly Cys Gly Val Val Ala Ala Glu Val Ala Gly Glu Asp Pro
340 345 350

Leu Arg Arg Gly Ile Pro His Trp Lys Val Leu Ser Cys Asp Glu Asp
355 360 365

Leu Trp Cys Ile Lys Arg Leu Gly Asp Asp Tyr Ser Asp Gly Val Val
370 375 380 

Gly Asp Trp Thr Lys Cys His Leu Ala Phe Pro Phe Leu Glx 
385 390 395 

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: YES 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
GAGTTGCGCA TG 12 
(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
Gly Val Ala His 



(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: YES 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
TGCTACAATC AGAATTCTTG CAGT 24
(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

Ala Thr Ile Arg Ile Leu Ala Val
1 5 



(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: YES 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 
GGATCCTCTA GTCAAATTAC CGC 23 



(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:
AGATCTGGTA TATTCCGTCT GCAC 24



(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:
CCGGATTCGG TTTGTAGC 18
(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: YES



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 
GACGTGCATG TTCTTGGG 18
(2) INFORMATION FOR SEQ ID NO:25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: YES



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:
GAAAGCCACA TCACCTGC 18
(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 
GGGGTGGAGT TATCCAC 17 
(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 
GACACCGGGA AGTATCG 17 
(2) INFORMATION FOR SEQ ID NO:28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:
CTGCTTTCAT AGAAGAGGC 19 
(2) INFORMATION FOR SEQ ID NO:29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 
GTCAGAACAA ACCTGCTCC 19
(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:
CACCCAGGTC TTGGTGG 17

(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: YES 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:
GGCCGCCATG GATGCG 16

(2) INFORMATION FOR SEQ ID NO:32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:
TCTCAATCAA GAGGAGGC 18
(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: YES 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33: 
CTTGAAGGAT CCGAGTGG 18 
(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:
CAGGTTGGCG AGTTCCTCG 19 
(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: YES 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 
CTTGCTGTTA TTCTCCATGC 20 
(2) INFORMATION FOR SEQ ID NO:36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:
CCCTGGACCA GCTCCTGG 18
(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:37: 
TGGCGCAAGC ATCGTCCC 18 
(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:
AAATGTTCAG GAATCTCTCG 20 
(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 
CTGGCTGGCA GCCACGCC 18 
(2) INFORMATION FOR SEQ ID NO: 40: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: YES 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 
GCGTTCTCAA AGCTGCGG 18
(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:
ACTGATGGGT CTTCTGGG 18
(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: YES 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:42: 
GGATCAGGAT GGACCCGG 18
(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:
TGGTTGCTGA AGCCAGGG 18
(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: YES 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 
TCCATTCATA GAGAGTGGG 19
(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: YES



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:
ATGCCCAAGA ACATGCACG 19
(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:
CAACTGATCC TTTACCCTGC 20
(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:
GTTGTTAGGT CAACTTGCG 19
(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: YES 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:
CTCTGTTAGG GCTTCCTCC 19
(2) INFORMATION FOR SEQ ID NO:49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:
GAATCAGATT TCGCGAGG 18
(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: YES 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:50: 
GTCCAAATGG AGGAAGCC 18
(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:
CCACGACTGT ACAATTGACC TTG 23
(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: YES 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 
CATGATCGCA AGTTGACC 18 
(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:
AGAAAACTCT TATCAAGCTA CG 22
(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: YES 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 
AAGCTTATGG GTGCTCGTGC 20
(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: YES 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: 
GGAAAGAGAG AAAGACTCAG 20
(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:
GCCACCAAGT CATACCCG 18
(2) INFORMATION FOR SEQ ID NO: 57:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:
CCTTCTATAT TTGGTTCC 18
(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: YES 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 
CCATTCTCCG GAATAATCC 19
(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: YES



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:
CACGGAGCAG GATAAGGGTA 20
(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:
CGGATTGGAT TGTGTGTGC 19
(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:
CGCCACTGCA TGTAAGAAC 19
(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:
TCCACACGCT TAATACGGC 19
(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:
GGTACGGAGA AGAAGGAG 18 
(2) INFORMATION FOR SEQ ID NO: 64:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: YES 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:64: 
CGCGGGATAT TGATTCGGT 19 
(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: YES 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65: 
GTGTTGAACA CGCCCACAA 19
(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: YES 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 
ACGACACCAC AACCACCT 18 
(2) INFORMATION FOR SEQ ID NO: 67: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: YES



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:
GACAAGAAGA CACAAACC 18 
(2) INFORMATION FOR SEQ ID NO: 68:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: YES 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:68: 
GAATCGGAGG AGAAGGTC 18 



(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 240 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69: 

Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 
1 5 10 15 

Xaa Met Phe Gly Tyr Arg Ser Asn Val Pro Lys Val Arg Leu Thr Thr 
20 25 30 

Asp Arg Leu Val Val Arg Leu Val His Asp Arg Asp Ala Trp Arg Leu 
35 40 45 

Ala Asp Tyr Tyr Ala Glu Asn Arg His Phe Leu Lys Pro Trp Glu Pro 
50 55 60 

Val Arg Asp Glu Ser His Cys Tyr Pro Ser Gly Trp Gln Ala Arg Leu 
65 70 75 80 

Gly Met Ile Asn Glu Phe His Lys Gln Gly Ser Ala Phe Tyr Phe Gly 
85 90 95 

Leu Phe Asp Pro Asp Glu Lys Glu Ile Ile Gly Val Ala Asn Phe Ser 
100 105 110 

Asn Val Val Arg Gly Ser Phe His Ala Cys Tyr Leu Gly Tyr Ser Ile 
115 120 125 

Gly Gln Lys Trp Gln Gly Lys Gly Leu Met Phe Glu Ala Leu Thr Ala 
130 135 140 

Ala Ile Arg Tyr Met Gln Arg Thr Gln His Ile His Arg Ile Met Ala 
145 150 155 160 

Asn Tyr Met Pro His Xaa Xaa Xaa Xaa Asn Lys Arg Ser Gly Asp Leu 
165 170 175 

Leu Ala Arg Leu Gly Phe Glu Lys Glu Gly Tyr Ala Lys Asp Tyr Leu 
180 185 190 

Leu Ile Asp Gly Gln Trp Arg Asp His Val Leu Thr Ala Leu Thr Thr 
195 200 205 

Pro Asp Trp Thr Pro Gly Arg Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 
210 215 220 

Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 
225 230 235 240 



(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 240 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:70: 

Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 
1 5 10 15 

Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Met Glu Thr Glu Ile Lys Val Ser 
20 25 30 

Glu Ser Leu Glu Leu His Ala Val Ala Glu Asn His Val Lys Pro Leu 
35 40 45 

Tyr Gln Leu Ile Cys Lys Asn Lys Thr Trp Leu Gln Gln Ser Leu Asn 
50 55 60 

Trp Pro Gln Phe Val Gln Ser Glu Glu Asp Thr Arg Lys Thr Val Gln 
65 70 75 80 

Gly Asn Val Xaa Met Leu His Gln Arg Gly Tyr Ala Lys Met Phe Met 
85 90 95 

Ile Phe Xaa Xaa Lys Glu Asp Glu Leu Ile Gly Val Ile Ser Phe Xaa 
100 105 110 

Asn Arg Ile Glu Pro Leu Asn Lys Thr Ala Glu Ile Gly Tyr Trp Leu 
115 120 125 

Asp Glu Ser His Gln Gly Gln Gly Ile Ile Ser Gln Ala Leu Gln Ala 
130 135 140 

Leu Ile His His Tyr Ala Gln Ser Gly Glu Leu Arg Arg Phe Val Ile 
145 150 155 160 

Lys Cys Arg Val Asp Xaa Xaa Xaa Xaa Asn Pro Gln Ser Asn Gln Val 
165 170 175 

Ala Leu Arg Asn Gly Phe Ile Leu Glu Gly Cys Leu Lys Gln Ala Glu 
180 185 190 

Phe Leu Asn Asp Ala Tyr Asp Asp Val Asn Leu Tyr Ala Arg Ile Ile 
195 200 205 

Asp Ser Gln Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 
210 215 220 

Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 
225 230 235 240 



(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 240 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71: 

Xaa Xaa Xaa Xaa Xaa Xaa Xaa Met Leu Trp Ser Ser Asn Asp Val Thr 
1 5 10 15 

Gln Gln Gly Ser Arg Pro Lys Thr Lys Leu Gly Gly Ser Xaa Met Ser 
20 25 30 

Ile Ile Ala Thr Val Lys Ile Gly Pro Asp Glu Ile Ser Ala Met Arg 
35 40 45 

Ala Val Leu Asp Leu Phe Gly Lys Glu Phe Glu Asp Ile Pro Thr Tyr 
50 55 60 

Ser Asp Arg Gln Pro Thr Asn Glu Tyr Leu Ala Asn Leu Leu His Ser 
65 70 75 80 

Glu Thr Phe Ile Ala Leu Ala Ala Phe Asp Arg Gly Thr Ala Ile Gly 
85 90 95 

Gly Leu Ala Xaa Xaa Ala Tyr Val Leu Pro Lys Phe Glu Gln Ala Arg 
100 105 110 

Ser Glu Xaa Xaa Xaa Xaa Xaa Xaa Ile Tyr Ile Tyr Asp Leu Ala Val 
115 120 125 

Ala Ser Ser His Arg Arg Leu Gly Val Ala Thr Ala Leu Ile Ser His 
130 135 140 

Leu Lys Arg Xaa Val Ala Val Glu Leu Gly Ala Tyr Val Ile Tyr Val 
145 150 155 160 

Gln Ala Asp Tyr Gly Xaa Xaa Xaa Xaa Asp Asp Pro Ala Val Ala Leu 
165 170 175 

Tyr Thr Lys Leu Gly Val Arg Glu Asp Val Met His Phe Asp Ile Asp 
180 185 190 

Pro Arg Thr Ala Thr Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 
195 200 205 

Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 
210 215 220 

Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 
225 230 235 240 



(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 240 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72: 

Xaa Xaa Xaa Xaa Xaa Xaa Xaa Met Leu Arg Ser Ser Asn Asp Val Thr 
1 5 10 15 

Gln Gln Gly Ser Arg Pro Lys Thr Lys Leu Gly Gly Ser Ser Met Gly 
20 25 30 

Ile Ile Arg Thr Cys Arg Leu Gly Pro Asp Gln Val Lys Ser Met Arg 
35 40 45 

Ala Ala Leu Asp Leu Phe Gly Arg Glu Phe Gly Asp Val Ala Thr Tyr 
50 55 60 

Ser Gln His Gln Pro Asp Ser Asp Tyr Leu Gly Asn Leu Leu Arg Ser 
65 70 75 80 

Lys Thr Phe Ile Ala Leu Ala Ala Phe Asp Gln Glu Ala Val Val Gly 
85 90 95 

Ala Leu Ala Xaa Xaa Ala Tyr Val Leu Pro Lys Phe Glu Gln Ala Arg 
100 105 110 

Ser Glu Xaa Xaa Xaa Xaa Xaa Xaa Ile Tyr Ile Tyr Asp Leu Ala Val 
115 120 125 

Ser Gly Glu His Arg Arg Gln Gly Ile Ala Thr Ala Leu Ile Asn Leu 
130 135 140 

Leu Lys His Xaa Glu Ala Asn Ala Leu Gly Ala Tyr Val Ile Tyr Val 
145 150 155 160 

Gln Ala Asp Tyr Gly Xaa Xaa Xaa Xaa Asp Asp Pro Ala Val Ala Leu 
165 170 175 

Tyr Thr Lys Leu Gly Ile Arg Glu Glu Val Met His Phe Asp Ile Asp 
180 185 190 

Pro Ser Thr Ala Thr Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 
195 200 205 

Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 
210 215 220 

Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 
225 230 235 240 

(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 240 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73: 

Met Thr Thr Leu Asp Asp Thr Ala Tyr Arg Tyr Arg Thr Ser Val Pro 
1 5 10 15 

Gly Asp Ala Glu Ala Ile Glu Ala Leu Asp Gly Ser Phe Thr Thr Asp 
20 25 30 

Thr Val Phe Arg Val Thr Ala Thr Gly Asp Gly Phe Thr Leu Arg Glu 
35 40 45 

Val Pro Val Asp Pro Pro Leu Thr Lys Val Xaa Xaa Phe Pro Asp Asp 
50 55 60 

Glu Ser Asp Asp Glu Ser Asp Asp Gly Glu Asp Gly Asp Pro Asp Ser 
65 70 75 80 

Arg Thr Phe Val Ala Tyr Gly Asp Xaa Xaa Xaa Xaa Xaa Xaa Asp Gly 
85 90 95 

Asp Leu Ala Xaa Xaa Gly Phe Val Val Ile Ser Tyr Ser Ala Trp Asn 
100 105 110 

Arg Arg Xaa Xaa Xaa Xaa Xaa Xaa Leu Thr Val Glu Asp Ile Glu Val 
115 120 125 

Ala Pro Glu His Arg Gly His Gly Val Gly Arg Ala Leu Met Gly Leu 
130 135 140 

Ala Thr Glu Xaa Phe Ala Gly Glu Arg Gly Ala Gly His Leu Trp Leu 
145 150 155 160 

Glu Val Thr Asn Val Xaa Xaa Xaa Xaa Asn Ala Pro Ala Ile His Ala 
165 170 175 

Tyr Arg Arg Met Gly Phe Thr Leu Cys Gly Leu Asp Thr Ala Leu Tyr 
180 185 190 

Asp Gly Thr Ala Ser Asp Gly Glu Arg Gln Ala Leu Tyr Met Ser Met 
195 200 205 

Pro Cys Pro Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 
210 215 220 

Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 
225 230 235 240 



(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 240 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74: 

Met Thr Thr Thr His Gly Ser Thr Tyr Glu Phe Arg Ser Ala Arg Pro 
1 5 10 15 

Gly Asp Ala Glu Ala Ile Glu Gly Leu Asp Gly Ser Phe Thr Thr Ser 
20 25 30 

Thr Val Phe Glu Val Asp Val Thr Gly Asp Gly Phe Ala Leu Arg Glu 
35 40 45 

Val Pro Ala Asp Pro Pro Leu Val Lys Val Xaa Xaa Phe Pro Asp Asp 
50 55 60 

Gly Gly Ser Asp Gly Glu Asp Gly Ala Glu Gly Glu Asp Ala Asp Ser 
65 70 75 80 

Arg Thr Phe Val Ala Val Gly Ala Xaa Xaa Xaa Xaa Xaa Xaa Asp Gly 
85 90 95 

Asp Leu Ala Xaa Xaa Gly Phe Ala Ala Val Ser Tyr Ser Ala Trp Asn 
100 105 110 

Gln Arg Xaa Xaa Xaa Xaa Xaa Xaa Leu Thr Ile Glu Asp Ile Glu Val 
115 120 125 

Ala Pro Gly His Arg Gly Lys Gly Ile Gly Arg Val Leu Met Arg His 
130 135 140 

Ala Ala Asp Xaa Phe Ala Arg Glu Arg Gly Ala Gly His Leu Trp Leu 
145 150 155 160 

Glu Asn Thr Asn Val Xaa Xaa Xaa Xaa Asn Ala Pro Ala Ile His Ala 
165 170 175 

Tyr Arg Arg Met Gly Phe Ala Phe Cys Gly Leu Asp Ser Ala Leu Tyr 
180 185 190 

Gln Gly Thr Ala Ser Glu Gly Glu Xaa His Ala Leu Tyr Met Ser Met 
195 200 205 

Pro Cys Pro Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 
210 215 220 

Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 
225 230 235 240 



(2) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 240 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75: 

Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Met Lys Ile Ser Val Ile Pro Glu 
1 5 10 15 

Gln Val Ala Glu Thr Leu Asp Ala Xaa Glu Asn His Phe Ile Val Arg 
20 25 30 

Glu Val Phe Asp Val His Leu Ser Asp Gln Gly Phe Glu Leu Ser Thr 
35 40 45 

Arg Ser Val Ser Pro Tyr Arg Lys Asp Tyr Xaa Xaa Ile Ser Asp Asp 
50 55 60 

Asp Ser Asp Glu Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Asp Ser 
65 70 75 80 

Ala Cys Tyr Gly Ala Phe Xaa Ile Xaa Xaa Xaa Xaa Xaa Xaa Asp Gln 
85 90 95 

Glu Leu Val Xaa Xaa Gly Lys Ile Glu Leu Asn Xaa Ser Thr Trp Asn 
100 105 110 

Asp Leu Xaa Xaa Xaa Xaa Xaa Xaa Ala Ser Ile Glu His Ile Val Val 
115 120 125 

Ser His Thr His Arg Gly Lys Gly Val Ala His Ser Leu Ile Glu Phe 
130 135 140 

Ala Lys Lys Xaa Trp Ala Leu Ser Arg Gln Leu Leu Gly Ile Arg Leu 
145 150 155 160 

Glu Thr Gln Thr Asn Xaa Xaa Xaa Xaa Asn Val Pro Ala Cys Asn Leu 
165 170 175 

Tyr Ala Lys Cys Gly Phe Thr Leu Gly Gly Ile Asp Leu Phe Thr Tyr 
180 185 190 

Lys Thr Arg Pro Gln Val Ser Asn Glu Thr Ala Met Tyr Trp Tyr Trp 
195 200 205 

Phe Ser Gly Ala Gln Asp Asp Ala Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 
210 215 220 

Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 
225 230 235 240 



(2) INFORMATION FOR SEQ ID NO: 76: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 240 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76: 

Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Met 
1 5 10 15 

Ala Lys Phe Lys Ile Arg Pro Ala Thr Ala Ser Asp Cys Ser Xaa Xaa 
20 25 30 

Xaa Xaa Asp Ile Leu Arg Leu Ile Lys Glu Leu Ala Lys Tyr Glu Tyr 
35 40 45 

Met Glu Asp Gln Val Ile Leu Thr Glu Lys Asp Leu Gln Glu Asp Gly 
50 55 60 

Phe Gly Glu His Pro Phe Tyr His Cys Leu Val Ala Glu Val Pro Lys 
65 70 75 80 

Glu His Trp Thr Pro Xaa Xaa Xaa Xaa Xaa Glu Gly His Ser Ile Val 
85 90 95 

Gly Phe Ala Xaa Xaa Met Tyr Tyr Phe Thr Tyr Asp Pro Trp Ile Gly 
100 105 110 

Lys Leu Xaa Xaa Xaa Xaa Xaa Xaa Leu Tyr Leu Glu Asp Phe Phe Val 
115 120 125 

Met Ser Asp Tyr Arg Gly Phe Gly Ile Gly Ser Glu Ile Leu Lys Asn 
130 135 140 

Leu Ser Gln Xaa Val Ala Met Lys Cys Arg Cys Ser Ser Met His Phe 
145 150 155 160 

Leu Val Ala Glu Trp Xaa Xaa Xaa Xaa Asn Glu Pro Ser Ile Asn Phe 
165 170 175 

Tyr Lys Arg Arg Gly Ala Ser Asp Leu Ser Ser Glu Glu Gly Trp Xaa 
180 185 190 

Xaa Xaa Xaa Xaa Arg Leu Phe Lys Ile Asp Lys Glu Tyr Leu Leu Lys 
195 200 205 

Met Ala Ala Glu Glu Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 
210 215 220 

Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 
225 230 235 240 



(2) INFORMATION FOR SEQ ID NO: 77: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 240 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77: 

Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Met 
1 5 10 15 

Ala Lys Phe Val Ile Arg Pro Ala Thr Ala Ala Asp Cys Ser Xaa Xaa 
20 25 30 

Xaa Xaa Asp Ile Leu Arg Leu Ile Lys Glu Leu Ala Lys Tyr Glu Tyr 
35 40 45 

Met Glu Glu Gln Val Ile Leu Thr Glu Lys Asp Leu Leu Glu Asp Gly 
50 55 60 

Phe Gly Glu His Pro Phe Tyr His Cys Leu Val Ala Glu Val Pro Lys 
65 70 75 80 

Glu His Trp Thr Pro Xaa Xaa Xaa Xaa Xaa Glu Gly His Ser Ile Val 
85 90 95 

Gly Phe Ala Xaa Xaa Met Tyr Tyr Phe Thr Tyr Asp Pro Trp Ile Gly 
100 105 110 

Lys Leu Xaa Xaa Xaa Xaa Xaa Xaa Leu Tyr Leu Glu Asp Phe Phe Val 
115 120 125 

Met Ser Asp Tyr Arg Gly Phe Gly Ile Gly Ser Glu Ile Leu Lys Asn 
130 135 140 

Leu Ser Gln Xaa Val Ala Met Arg Cys Arg Cys Ser Ser Met His Phe 
145 150 155 160 

Leu Val Ala Glu Trp Xaa Xaa Xaa Xaa Asn Glu Pro Ser Ile Asn Phe 
165 170 175 

Tyr Lys Arg Arg Gly Ala Ser Asp Leu Ser Ser Glu Glu Gly Trp Xaa 
180 185 190 

Xaa Xaa Xaa Xaa Arg Leu Phe Lys Ile Asp Lys Glu Tyr Leu Leu Lys 
195 200 205 

Met Ala Thr Glu Glu Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 
210 215 220 

Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 
225 230 235 240 



(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 240 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:78: 

Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Met 
1 5 10 15 

Asn His Ala Gln Leu Arg Arg Val Thr Ala Glu Ser Phe Ala His Tyr 
20 25 30 

Arg His Gly Leu Ala Gln Leu Leu Phe Glu Thr Val His Gly Gly Xaa 
35 40 45 

Xaa Ala Ser Val Gly Phe Met Ala Asp Leu Asp Met Gln Gln Ala Tyr 
50 55 60 

Ala Trp Cys Asp Gly Leu Lys Ala Asp Ile Ala Ala Gly Ser Leu Leu 
65 70 75 80 

Leu Trp Val Val Ala Xaa Xaa Xaa Xaa Xaa Glu Asp Asp Asn Val Leu 
85 90 95 

Ala Ser Ala Xaa Xaa Gln Leu Ser Leu Cys Gln Lys Pro Asn Gly Leu 
100 105 110 

Asn Arg Xaa Xaa Xaa Xaa Xaa Xaa Ala Glu Val Gln Lys Leu Met Val 
115 120 125 

Leu Pro Ser Ala Arg Gly Arg Gly Leu Gly Arg Gln Leu Met Asp Glu 
130 135 140 

Val Glu Gln Xaa Val Ala Val Lys His Lys Arg Gly Leu Leu His Leu 
145 150 155 160 

Asp Thr Glu Ala Xaa Xaa Xaa Xaa Xaa Gly Ser Val Ala Glu Ala Phe 
165 170 175 

Tyr Ser Ala Leu Ala Tyr Thr Arg Val Gly Glu Leu Pro Gly Tyr Cys 
180 185 190 

Ala Thr Pro Asp Gly Arg Leu His Pro Thr Ala Ile Tyr Phe Lys Thr 
195 200 205 

Leu Gly Gln Pro Thr Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 
210 215 220 

Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 
225 230 235 240 



(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 240 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79: 

Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 
1 5 10 15 

Xaa Xaa Xaa Xaa Met Pro Asn Val Thr Ile Ala Arg Glu Ser Pro Leu 
20 25 30 

Gln Asp Ala Val Val Gln Leu Ile Glu Glu Leu Asp Arg Xaa Xaa Xaa 
35 40 45 

Xaa Xaa Xaa Xaa Xaa Tyr Leu Gly Asp Leu Tyr Pro Ala Glu Ser Asn 
50 55 60 

His Leu Xaa Xaa Xaa Leu Asp Leu Gln Thr Leu Ala Lys Pro Asp Ile 
65 70 75 80 

Arg Phe Leu Val Ala Xaa Xaa Xaa Xaa Xaa Arg Arg Ser Gly Thr Val 
85 90 95 

Val Gly Cys Xaa Xaa Gly Ala Ile Ala Ile Asp Thr Glu Gly Gly Tyr 
100 105 110 

Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Gly Glu Val Lys Arg Met Phe Val 
115 120 125 

Gln Pro Thr Ala Arg Gly Gly Gln Ile Gly Arg Arg Leu Leu Glu Arg 
130 135 140 

Ile Glu Asp Xaa Glu Ala Arg Ala Ala Gly Leu Ser Ala Leu Leu Leu 
145 150 155 160 

Glu Thr Gly Val Tyr Xaa Xaa Xaa Xaa Gln Ala Thr Arg Ile Ala Leu 
165 170 175 

Tyr Arg Lys Gln Gly Phe Ala Asp Arg Gly Pro Phe Gly Pro Tyr Gly 
180 185 190 

Pro Asp Pro Leu Ser Leu Phe Met Glu Lys Pro Leu Xaa Xaa Xaa Xaa 
195 200 205 

Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 
210 215 220 

Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 
225 230 235 240 



(2) INFORMATION FOR SEQ ID NO: 80: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 240 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80: 

Xaa Xaa Xaa Xaa Xaa Met Pro Ile Asn Ile Arg Arg Ala Thr Xaa Ile 
1 5 10 15 

Asn Asp Ile Ile Cys Met Gln Asn Ala Asn Leu His Asn Leu Pro Glu 
20 25 30 

Asn Tyr Met Met Lys Tyr Tyr Met Tyr His Thr Leu Ser Trp Pro Glu 
35 40 45 

Ala Ser Phe Val Ala Thr Thr Thr Thr Leu Asp Cys Glu Asp Ser Asp 
50 55 60 

Glu Gln Asp Glu Asn Asp Lys Leu Glu Leu Thr Leu Asp Gly Thr Asn 
65 70 75 80 

Asp Gly Arg Thr Ile Lys Leu Asp Pro Thr Tyr Leu Ala Pro Gly Glu 
85 90 95 

Lys Leu Val Xaa Xaa Gly Tyr Val Leu Val Lys Met Asn Asp Asp Pro 
100 105 110 

Asp Gln Gln Asn Glu Pro Pro Asn Gly His Ile Thr Ser Leu Ser Val 
115 120 125 

Met Arg Thr Tyr Arg Arg Met Gly Ile Ala Glu Asn Leu Met Arg Gln 
130 135 140 

Ala Leu Phe Ala Leu Arg Glu Val His Gln Ala Glu Tyr Val Ser Leu 
145 150 155 160 

His Val Arg Gln Ser Xaa Xaa Xaa Xaa Asn Arg Ala Ala Leu His Leu 
165 170 175 

Tyr Arg Asp Thr Leu Ala Phe Glu Val Leu Ser Xaa Xaa Xaa Xaa Ile 
180 185 190 

Glu Lys Ser Tyr Tyr Gln Asp Gly Glu Asp Ala Tyr Ala Met Lys Lys 
195 200 205 

Val Leu Lys Leu Glu Glu Leu Gln Ile Ser Asn Xaa Xaa Xaa Phe Thr 
210 215 220 

His Arg Arg Leu Lys Glu Asn Glu Glu Lys Leu Glu Asp Asp Leu Glu 
225 230 235 240 



(2) INFORMATION FOR SEQ ID NO: 81: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 240 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81: 

Met Glu Ile Val Tyr Lys Pro Leu Asp Ile Arg Asn Glu Glu Gln Phe 
1 5 10 15 

Ala Ser Ile Lys Lys Leu Ile Asp Ala Asp Leu Ser Glu Pro Tyr Ser 
20 25 30 

Ile Tyr Val Tyr Arg Tyr Phe Leu Asn Gln Xaa Xaa Xaa Trp Pro Glu 
35 40 45 

Leu Thr Tyr Ile Ala Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 
50 55 60 

Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Val Asp Asn Lys Ser 
65 70 75 80 

Gly Thr Pro Asn Ile Pro Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 
85 90 95 

Xaa Xaa Ile Xaa Xaa Gly Cys Ile Val Cys Lys Met Asp Xaa Xaa Xaa 
100 105 110 

Pro His Arg Asn Val Arg Leu Arg Gly Tyr Ile Gly Met Leu Ala Val 
115 120 125 

Glu Ser Thr Tyr Arg Gly His Gly Ile Ala Lys Lys Leu Val Glu Ile 
130 135 140 

Ala Ile Asp Lys Met Gln Arg Glu His Cys Asp Glu Xaa Ile Met Leu 
145 150 155 160 

Glu Thr Glu Val Glu Xaa Xaa Xaa Xaa Asn Ser Ala Ala Leu Asn Leu 
165 170 175 

Tyr Xaa Glu Gly Met Gly Phe Ile Arg Met Lys Xaa Xaa Xaa Xaa Arg 
180 185 190 

Met Phe Arg Tyr Tyr Leu Asn Glu Gly Asp Ala Phe Lys Leu Xaa Xaa 
195 200 205 

Ile Leu Pro Leu Thr Glu Lys Ser Cys Thr Arg Ser Thr Phe Leu Met 
210 215 220 

His Gly Arg Leu Ala Thr Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 
225 230 235 240 



(2) INFORMATION FOR SEQ ID NO: 82: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 240 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82: 

Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 
1 5 10 15 

Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 
20 25 30 

Met Asn Tyr Gln Ile Val Asn Ile Ala Glu Cys Ser Asn Tyr Gln Leu 
35 40 45 

Glu Ala Ala Asn Ile Leu Thr Glu Ala Phe Asn Asp Leu Gly Asn Asn 
50 55 60 

Ser Trp Pro Asp Met Thr Ser Ala Thr Lys Glu Val Lys Glu Cys Ile 
65 70 75 80 

Glu Ser Pro Asn Leu Cys Phe Gly Leu Leu Ile Asn Asn Ser Leu Val 
85 90 95 

Gly Trp Ile Xaa Xaa Gly Leu Arg Pro Met Tyr Lys Glu Thr Trp Glu 
100 105 110 

Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Leu His Pro Leu Val Val 
115 120 125 

Arg Pro Asp Tyr Gln Asn Lys Gly Ile Gly Lys Ile Leu Leu Lys Glu 
130 135 140 

Leu Glu Asn Arg Xaa Ala Arg Glu Gln Gly Ile Ile Gly Ile Ala Leu 
145 150 155 160 

Gly Thr Asp Asp Glu Tyr Tyr Arg Thr Ser Leu Ser Leu Ile Thr Ile 
165 170 175 

Thr Glu Asp Asn Ile Phe Asp Ser Ile Lys Asn Ile Lys Asn Ile Asn 
180 185 190 

Lys His Pro Tyr Glu Phe Tyr Gln Lys Asn Gly Tyr Tyr Ile Val Gly 
195 200 205 

Ile Ile Pro Asn Ala Asn Gly Lys Asn Lys Pro Asp Ile Trp Met Trp 
210 215 220 

Lys Ser Leu Ile Lys Glu Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 
225 230 235 240 



WHAT IS CLAIMED IS: 

1. An isolated nucleic acid sequence comprising 
the nucleic acid sequence encoding the sequences selected 
from the group consisting of SEQUENCE ID NOS: 1 and 2. 

2. An isolated protein sequence comprising the 
amino acid sequence set forth in SEQUENCE ID NO: 3. 

3. An isolated nucleic acid sequence comprising 
the nucleic acid sequence encoding the sequence set forth 
in SEQUENCE ID NO: 4. 

4. An isolated nucleic acid sequence comprising 
the nucleic acid sequence encoding the sequence set forth 
in SEQUENCE ID NO: 5. 

5. An isolated protein sequence comprising the 
amino acid sequence set forth in SEQUENCE ID NO: 6. 

6. An isolated nucleic acid sequence comprising 
the nucleic acid sequence encoding the sequence set forth 
in SEQUENCE ID NO: 7. 

7. An isolated protein sequence comprising the 
amino acid sequence set forth in SEQUENCE ID NO: 8. 

8. An isolated nucleic acid sequence comprising 
the nucleic acid sequence encoding the sequence set forth 
in SEQUENCE ID NO: 9. 

9. An isolated protein sequence comprising the 
amino acid sequence set forth in SEQUENCE ID NO: 10. 

10. An isolated nucleic acid sequence comprising 
the nucleic acid sequence encoding the sequence set forth 
in SEQUENCE ID NO: 11. 

11. An isolated protein sequence comprising the 
amino acid sequence set forth in SEQUENCE ID NO: 12. 

12. An isolated nucleic acid sequence comprising 
the nucleic acid sequence encoding the sequences selected 
from the group consisting of SEQUENCE ID NOS: 14 and 15. 

13. An isolated protein sequence comprising the 
amino acid sequence set forth in SEQUENCE ID NO: 16. 

14. A DNA sequence comprising a sequence 
complementary to an isolated nucleic acid sequence of claim 
1. 

15. A transformed plant cell comprising the 
nucleic acid sequence selected from the group consisting of 
SEQUENCE ID NOS: 1, 2, 4, 5, 7, 9, 11, 14, and 15. 

16. A plant comprising a heterologous nucleic 
acid sequence selected from the group consisting of SEQ ID 
NOS: 1, 2, 4, 5, 7, 9, 11, 14, and 15. 

17. A DNA sequence comprising a sequence 
complementary to an isolated nucleic acid sequence selected 
from the group consisting of SEQ ID NOS: 1, 2, 4, 5, 7, 9, 
11, 14, and 15. 



WO 95/35318 



PCT/US95/07744 



SUBSTITUTE SHEET (RULE 26) 



2/34 

[EcoR I Allele Blot (image not reproduced)] 



3/34 

[FIG. 3: map of pgEE1.2 (drawing not reproduced). Recoverable labels: E sites; fragment sizes 1.2 kb, 4.3 kb and 5.5 kb; 24 bp DELETION; COLUMBIA; ein2-12.] 



[Sheets 4/34 through 7/34: drawings not recoverable from the scan.] 



WO 95/35318 



PCT/US95/07744 



FIG. 6

Pathway diagram labels, top to bottom: METHIONINE; S-ADENOSYLMETHIONINE; 1-AMINOCYCLOPROPANE-1-CARBOXYLATE; ETHYLENE; RECEPTOR / RECEPTOR COMPLEX; RESPONSE.

Inhibitor labels: (AVG); (AIB); (Ag+); (trans-cyclooctene).

Mutant labels: eto1; ein1; ein2; hls1.






FIG. 7A

SHOOT APEX (STEM)
EPICOTYL (STEM)
HYPOCOTYL (STEM)
RADICLE (ROOT)
ROOT APEX

COTYLEDONS

FOLIAGE LEAF

COTYLEDONS

FIG. 7B

EPICOTYL

HYPOCOTYL

ROOT






Alignment rows in each block, top to bottom: pileup.msf(eil1), pileup.msf(ein3), pileup.msf(eil2), pileup.msf(eil3), Consensus

1 50

..hhhmMMFM EMGMYGNMDF FSSs.JolD vCPIPQoEqE pVVeDVDYtD 
iiittUMFN EMGMCGNMDF FSSgSLgEVD fCPvPQoEpD olVED.DYtD 
. .dsmdMynN niGMFrsLvc sSoppFTEgh MCs...dsht olcDDIs.sO 

mg DLoM SvoOIr MenePddlos dnVoEIDvoD 

M 0 

51 100 
DEmDVDELEk RMWRDKMRLK RLKEOQsKcK EGVDgsKQRO SW. .EOARRK 
OEiDVDELEr RMWRDKMRLK RLKEQd.KGK EGVOooKQRO SQ..EQARRK 
EEmElEELEk IciWRDKqRLK RLKEmoKnGl gtrlllKQqh ddfpEhsskr 
EEiDoDDLEr RMWkOrvRLK RiKErQKoGs qGoqt.Ketp kkisDQAqRK 
-E LE- -W-O-RLK R-KE K 

101 150
KMSRAQDGIL KYMLKMMEVC KAQGFVYGII PEkGKPVTGo SDNLREWWKD
KMSRAQDGIL KYMLKMMEVC KAQGFVYGII PEnGKPVTGo SDNLREWWKD
tMykoQDGIL KYMsKtMErY KAQGRVYGIV lEnGKtVoGs SDNLREWWKD
KMSRAQDGIL KYMLKLMEVC KvrGFVYGII PEkGKPVoGs SDNiRoWWKE
-M--AQDGIL KYM-K-ME-- K--GFVYGI- -E-GK-V-G- SDN-R-WWK-

151 200 
KVRFORNGPA AlAKYQsENN ISGGSnDcNs IVGPTPHTLO ELQDTTLGSL 
KVRFDRNGPA AltKYQoENN Ip.GihEGNN pIGPTPHTLQ ELQDTTLGSL 
KVRFORNGPA AliKhQrDiN ISdGSDsGse vgdsToqkLI ELQDTTLGoL 
KVkFDkNGPA AIAKYeeEcl ofGkSOgnrN ....sqfvLQ DLQDoTLGSL 
KV-FD-NGPA AI-K L- -LQD-TLG-L 



201 250
LSALMQHCDP PQRRFPLEKG VsPPWWPnGn EEWWPQLGLP nE..QGPPPY
LSALMQHCDP PQRRFPLEKG VPPPWWPnGk EDWWPQLGLP KD..QGPoPY
LSALfpHCnP PQRRFPLEKG VtPPWWPtGk EDWWdQLsLP vDfrgvPPPY
LSsLMQHCDP PQRkYPLEKG tPPPWWPtGn EEWWvkLGLP Ks...qsPPY
LS-L--HC-P PQR--PLEKG --PPWWP-G- E-WW--L-LP --------PY

251 300
KKPHDLKKoW KVGVLTAVIK HMsPDIAKIR KLVRQSKCLQ DKMTAKESAT
KKPHDLKKoW KVGVLTAVIK HMFPDIAKIR KLVRQSKCLQ DKMTAKESAT
KKPHDLKKlW KIGVLigVlr HMosDlsnlp nLVRrSrsLQ EKMTsrEgAI
rKPHDLKKmW KVGVLTAVln HMLPDIAKIk rhVRWSKCLQ DKMTAKESAi
-KPHDLKK-W K-GVL--VI- HM--DI--I- --VR-S--LQ -KMT--E-A-



301 350
WLAliNQEEv voReLYPES CPPLSs SssIGSgSLL iNDCSEYDVE
WLAIiNQEES loReLYPES CPPLSL Sg..GScSLL mNDCSqYDVE
WLAolyrEko ivdq ioM SrennntSnF i vpotggDpD
WLAVINQEES liqqpssOng nsnvtethrr gnnodrrkpv vNsdSDYDVD
WLA E - - " D"



FIG. 8A

FIG. 8 (key to sheets: FIG. 8A, FIG. 8B)






Alignment rows in each block, top to bottom: pileup.msf(eil1), pileup.msf(ein3), pileup.msf(eil2), pileup.msf(eil3), Consensus



GFEKEqHgFO VEErKPEiVM nhpLosfgVA KMQhFPIKEE VottvNIEFT 
GFEKESH.YE VEEIKPEkVM nssnfGm.VA KMhdFPVKEE Vpog.NsEFm 

vUpEstdYD VE LiGgthr tnQqYP...E fennyNcvYk 

GtEeoSgsvs skDsrrnql q KeQptolshs VrdqdkoEkh 



401 450 
RKRKqNnDMN vmVMDRSogY TCENgqCPHS kmnLGFqDRs SRDNHQMvCP 
RKRKpNRDLN LIMDR.TvF TCENIgCoHS eisrGFLDRN SRDNHQLaCP 

RKfeedfgMp m hpTIL TCENslCPyS QphMGFLORN I RENHQMICP 

RrRKrpR iRSgtv nrqeeeqPeo QqrniLpDmN hvDoplLeYn 

R D - 

451 500 

YROnRLoYGA ..SkFHMGgm KIVV...pqq PV QPI DLsGVgVPEn 

hRDsRLpYGA opSrFHvnev KpVVgFpqPr PVNsvo.QPI DLTGI.VPED 

YkvTsF yqpT.kPy gMTGIMVP. . 

ingThqeddv vdpnioLGpe dngleLvvPe fnNnyTylPI vneqtMmPvD 
P P~ 

501 550 
GQKMItELmo MYDRnVQS. . ..nQTpptLM ENQSmvidak oaqNqQInFn 
GOKMlsELms MYDRnVQS. . ..nQT.amvM ENQSvslLqP tvhNhQehLq 

....cpDyng H.qqqVQS.. ..fQdqf... .NhpnDlyrP kopqr 

erpMlygpnp nqElqfgSgy nfynpsavFv hNQedDiLht qie 

S N 

551 600 

SGNQm Fmq 

fpgnmvegsf fedlnipnro NnnnsSnNQt Ffqgnnnnnn vFkFdtoDhn 

GNdd Lved 

m NtqapphNog Feeopggvlq pLgLlgnEdg 

N 

601 650

qgtN nGVNNRFOMV FDSTpFDMAo FDYRODWqlG amEgmGkqqq 

nfeoahNnnN nssgNRFQLV FDSTpFDMAs FDYRDOmSmp Gv..VGTmdg 

LNpsp stlNqrlglV L.pTdFn G GeEtVGTenn 

vtgseLpqyq sGllspL TdlDfdy ggFgDDFSwr Go 

651 664 
qQQQQQDVSI W... 
MQQkQQDVSI W... 
LhnQgQElpt swiq 



FIG. 8B 
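
The Consensus rows in FIG. 8A and FIG. 8B appear to carry a residue only where all four aligned sequences agree and a dash elsewhere. The following is a minimal Python sketch of that rule, not the program used to produce the figure; the function name `consensus` and the handling of gap characters are assumptions made for illustration, and the sample rows are the last ten-residue column of the 101-150 block.

```python
def consensus(rows):
    """Return a consensus string for equal-length aligned rows.

    A column yields its upper-cased residue when every row agrees
    (ignoring case); otherwise it yields '-'. Columns made only of
    spaces are kept as spaces so ten-residue groups stay aligned.
    Gap characters ('.' or '-') never count as agreement (assumption).
    """
    if len({len(r) for r in rows}) != 1:
        raise ValueError("aligned rows must have equal length")
    out = []
    for column in zip(*rows):
        residues = {c.upper() for c in column}
        if residues == {" "}:
            out.append(" ")
        elif len(residues) == 1 and column[0] not in ".-":
            out.append(column[0].upper())
        else:
            out.append("-")
    return "".join(out)


# Illustrative rows (eil1, ein3, eil2, eil3), positions 141-150.
block = [
    "SDNLREWWKD",
    "SDNLREWWKD",
    "SDNLREWWKD",
    "SDNiRoWWKE",
]
print(consensus(block))  # SDN-R-WWK-
```

The printed string matches the Consensus row shown for that column in FIG. 8A.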






FIGURE 10

Ecker RI's, Chr Three. Chi-square Stats, 99% Limit.

Linkage map of chromosome three. Legible marker labels include nga172, m583, nga126, nga162, g4708, g6220, m105, g4711, m433, g2440, g4117, g4564b, g4014, m457, g2778, g5966, nga112 and nga6.






FIGURE 11

Ecker RI's, Chr Two. Chi-square Stats, 99% Limit.

Linkage map of chromosome two. Legible marker labels include g4553, g4532, g4133, m216, g6842, m283, m323, g17288, nga168 and g4514.






FIGURE 12

Ecker RI's, Chr Five. Chi-square Stats, 99% Limit.

Linkage map of chromosome five. Legible marker labels include nga225, g3837, nga158, g4560, nga106, nga139, g4715b, m217, CHS/g6833/tt4, nga249, nga76, PHYC, g4028, m247, m435, m331, nga129, TSB1 and m555.






FIGURE 13

Ecker RI's, Chr Four. Chi-square Stats, 99% Limit.

Linkage map of chromosome four. Legible marker labels include g3843, nga12, nga8, g6837, g10086, g4564a, g3845, g8300, g3088, HLS1 and g3713.






EIN3 cDNA 






I (J 1 1 (J I TCTTCTTCCTCTTCCTCATCTCGTATCTCTAACTTTTGTCGMGTTCT 
TTTGATGAAACTAGGGTTTATTATCTTCTC CT IC1 II i 1 CCC ATCACC ATAGAA 
AAGGCAGAGAC C 1 1 I 1 1 C 1 1 CATCATTTTTATTCTCCTTCTTCTTCTGCTGT 
TCATTTCTCCAGGTTACAATGATGTTTAATGAGATGGGAATGTGTGGAAACAT 
GGA 1 1 101 1UIUI I CTGGATCACTTGGTGAAGTTGATTTCTGTCCTGTTCCACA 
AGCTGAGCCTGATTCCATTGTTGAAGATGACTATACTGATGATGAGATTGATG 
TTGATGMTTGGAGAGGAGGATGTGGAGAGACAAAATGCGGCTTAAACGTCT 
CAAGGAGCAGGATAAGGGTAAAGAAGGTGTTGATGCTGCTAAACAGAGGCA 
GTCTCAAGAGCAAGCTAGGAGGAAGAAAATGTCTAGAGCTCAAGATGGGATC 
7TGAAGTATATGTTGAAGATGATGGAAGTTTGTAAAGCTCAAGG(J I I IGI I I A T 
GGGATTATTCCX3GAGAATGGGAAGCCTGTGACTGGTGCTTCTGATAATTTAAG 
GGAGTGGTGGAAAGATMGGTTAGGTTTGATCGTAATGGTCCTGCGGCTATTA 
CCAAGTATCAAGCGGAGAATAATATCCCGGGGATTCATGAAGGTAATAACCC 
GATTGGACCGACTCCTCATACCTTGCAAGAGCTTCAAGACACGACTCTTGGA 
TCGCTTTTGTUTGCGTTGATGCMCACTGTGATCCTCCTCAGAGACGTTTTCC 
TTTGGAGAMGGAGTTCCTCCTCCGCGGTGGCCTAATGGGAAAGAGGATTGG 
TGGCCTCMCTTGGTTTGCCTAAAGATCAAGGTCCTGCACCTTACAAGAAGC 
CTCATGATTTGMGAAGGCGTGGAAAGTCGGCGTTTTGACTGCGGTTATCAA 
GCATATGTTTCCTGATATTGCTAAGATCCGTAAGCTCGTGAGGCAATCTAAAT 
GTTTGCAGGATAAGATGACTGCTAAAGAGAGTGCTACCTGGCTTGCTATTATT 
AACCMGAAGAGTCCTTGGCTAGAGAGCTTTATCCCGAGTCATGTCCACCTC 
TTTCTCrrGTCTGGTGGAAGTTGCTCGCTTCrGATGAATGATTGCAGTCAATAC 
GATGTTGAAGGTTTCGAGAAGGAGTCTCACTATGAAGTGGAAGAGCTCAAGC 
CAGAAAAAGTTATGAATTCTTCAAACTTTGGGATGGTTGCTAAAATGCATGAC 
TTTCCTGTCAAAGMGMGTCCCAGCAGGAAACTCGGAATTCATGAGAAAGA 
GAMGCCAAACAGAGATCTGMCACTATTA7GGACAGAACCGTTTTCACCTG 
CGAGAATCTTGGGTGTGCGCACAGCGAAATCAGCXJGGGGA! 1 1 G I GGATAG 
GAATTCGAGAGACAACCATCAACTGGCATGTCCACATCGAGACAGTCGCTTA 
CCGTATGGAGCAGCACCATCCAGGTTTCATGTCAATGAAGTTAAGCCTG 
TAGTTGGATTTCCTCAGCCAAGGCCAGTGAACTCAGTAGCCCAACCAATTGA 
CTTAACGGGTATAGTTCCTGAAGATGGACAGAAGATGATCTCAGAGCTCATG 
TCXJATGTACGACAGAAATGTCCAGAGCAACCAAACCTCTATGGTCATGGAAA 
ATCAAAGCGTGTCACTGCTTCAACCCACAGTCCATAACCATCAAGAACATCT 
CCAGTTCCCAGGAAACATGGTGGAAGGAAG I 1 1 CM I G AAGACTTG AACATC 
CCAAACAGAGCAMCAACAACAACAGCAGCAACAATCAAACGTTTTTTCAAG 
GGAACAACMCAACAACAATGTGTTTAAGTTCGACACTGCAGATCACAACAA 
CTTTGAAGCTGCACATAACAACAACAATAACAGTAGCGGCAACAGGTTCCAG 
CI I Li lb 1 1 1 GATTCCACACCGTTCGACATGGCGTCATTCGATTACAGAGATGA 
TATGTCGATGCCAGGAGTAGTAGGAACGATGGATGGAATGCAGCAGAAGCA 
GG XAGATGTATCCATATGGTTCT AAAGTCTTGGTAGTAGATTTCATCTTCTCTT 
Al 1 1 1 lATCTTTTGTGTTCTTACATTCACTCAACCATGTAATAI 1 1 1 1 ICCTGGG 
TCTCTCTGTCTCTATCX3CTTGTTATGATGTGTCTGTMGAGTCTCTAAAMCTC 
TCTGTTACTGTGTGTCTTTGTCTCGGCTTGGTGAATCTCTCTGTCATCATCAG 
CTTTTAGTTACACACCCXaACrrTGGGGATGAACGAACACTAAATGTAAGTTTTC 
A 
EIN3 genomic



AGAGCAGTGAGTATTNCCACNAGCCGCTTTGTTMTTACATATTAATTGTGTA 
ATMTMTAATAAATGATGTCJTTAAATTTTATGTGTAAGAMTGAMTTAAAATG 
ATATATATGTATATTATATATCTANACATATATATATATATATAMTA 
ATACTAT6ATCTATCTTCCTGATCTACAGAGAGACTCCACAAAGAAACGCAAA 
TAAACAAAAGTCGCTTTCTAGCCACGTGATCTTTCGTCGAU I IIICIIGIICII 
CTTCTTCTrCCTCTTCCTCATCTCGTATCTCTAAC 1 1 I IGTCGAAGTTCI 1 1 IG 
ATGAAACTAGGGTTTATTATCTTCTCCTTCTTTTTCCCATCACCATAGAAAAGG 
CAGAGACC 1 1 1 1 ICTTCATCAI 1 1 1 1 ATTCTCCTTCTTCTTCTGCTGTTCATTTC 
TCCAG GTACTATACGC 1 1 C ' l ' l C TTCTATTG A 1 1111 1 A GGGTTATTATTG ATACT 
GAAGATGATGATAGGTTTATTCATAGGGTTTTACTAG^TCGATGGTTTTACTTT 
AGmACTAGTGTTTACACGATCTMmCATGAGmATNCTACTTTTAGTTTT 
TTT^TGGGTG^GTmGmATTGTTTATAMTCGlTGATCTAmGAAAATC 
JI1ICIG1IICI lATTCATATATGATCCmCTATATTTGGTTCCTATGTTGAAG 
ATCTCATC C 1 1 1 1 1 1 1 GGAMTTGAATCTGTTGATMTTTTTATTATCCGATTGA 
TrATTTAGTTTAGGAGTGATTAAMTACGATUTGATTATGTGTTrATTACTTAAA 
ACTTTG ATTGAATTCGAAAAGCCCC 1 1 1 1 1 1 ATAATTTAGGGTTTGATGA 1 1 1 1 1 
TTTAGTMGTTGTTTGATTCAGAAGAAATATAATTGTACTGATTAG 1 1 I IGI I lli 
TGTATT7GATTTGTTACAGGTTACAATGATGTTTM 

TCCACMGCTGAGCCTGATTCCATTGTTGMGATGACTATACTGATGA 

TTGATGTTGATGAATTGGAGAGGAGGATG7GGAGAGACAAAATGCGGCTTAA 

ACGTCTCAAGGAGCAGGATAAGGGTAAAGAAGGTGTTGATGCTGCTAAACAG 

AGGCAGTCTCMGAGCMGCTAGGAGGAAGAAAATGTCTAGAGCTCAAGATG 

GGATCTTGAAGTATATGTTGAAGATGATGGMGTTTGTAMGCTCAAGGCTTT 

GTTTATGGGATTATTCCGGAGAATGGGMGCCTGTGACTGGTGCTTCTGATAA 

TTTAAGGGAGTGGTGGAAAGATAAGGTTAGGTTTGATCGT AATGGTCCTGCGG 

CTATTACCAAGTATCAAGCGGAGAATAATATCCCGGGGATTCATGAAGGTAAT 

MCCCGATTGGACCGACTCCTCATACCT7GCAAGAGCTTCAAGACACGACT 

CrnGGATCGCTTTTGTCTGCGTTGATGCAACACTGTGATCCrrCCTCAGAGAC 

GTTTTCCTTTGGAGAAAGGAGTTCCTCCTCCGTGGTGGCCTAATGGGAAAGA 

GGATTGGTGGCCTCAACTTGGTTTGCCTAAAGATCAAGGTCCTGCACCTTAC 

AAGMGCCTCATGATTTGMGAAGGCGTGGAMGTCGGCGTTTTGACTGCGG 

TTATCAAGCATATGTTTCCTGATATTGCTAAGATCCGTAAGCTCGTGAGGCAA 

TCTAAATGTTTGCAG G ATAAG ATGACTG CTAAAGAG AGTG CTACCTGGCTTGC 

TATrATTAACCAAGAAGAGTCCTTGGCTAGAGAGCTTTATCCCGAGTCATGTC 

FIGURE 19B 
EIN3 peptide

MMFNEMGMCGNMDFFSSGSLGEVDFCPVPQAEPDSIVEDDYTDDEIDVDELE
RRMWRDKMRLKRLKEQDKGKEGVDAAKQRQSQEQARRKKMSRAQDGILKYM
LKMMEVCKAQGFVYGIIPENGKPVTGASDNLREWWKDKVRFDRNGPAAITKYQ
AENNIPGIHEGNNPIGPTPHTLQELQDTTLGSLLSALMQHCDPPQRRFPLEKGV
PPPWWPNGKEDWWPQLGLPKDQGPAPYKKPHDLKKAWKVGVLTAVIKHMFP
DIAKIRKLVRQSKCLQDKMTAKESATWLAIINQEESLARELYPESCPPLSLSGG
SCSLLMNDCSQYDVEGFEKESHYEVEELKPEKVMNSSNFGMVAKMHDFPVK
EEVPAGNSEFMRKRKPNRDLNTIMDRTVFTCENLGCAHSEISRGFLDRNSRDN
HQLACPHRDSRLPYGAAPSRFHVNEVKPVVGFPQPRPVNSVAQPIDLTGIVPE
DGQKMISELMSMYDRNVQSNQTSMVMENQSVSLLQPTVHNHQEHLQFPGN
MVEGSFFEDLNIPNRANNNNSSNNQTFFQGNNNNNNVFKFDTADHNNFEAAH
NNNNNSSGNRFQLVFDSTPFDMASFDYRDDMSMPGVVGTMDGMQQKQQDV
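
The peptide listing can be checked against the cDNA listing by translating the open reading frame with the standard genetic code. The sketch below is illustrative only: the function name `translate` and the use of "X" for unreadable codons are assumptions, and the 36-nucleotide fragment is the start of the EIN3 coding region as it reads in the cDNA listing (beginning at the first ATG after "GGTTACA").

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, start=0):
    """Translate DNA from `start` until a stop codon or the end.

    Spaces are ignored; codons containing characters other than
    A, C, G or T (for example OCR damage) are reported as 'X'.
    """
    dna = dna.upper().replace(" ", "")
    peptide = []
    for i in range(start, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)


fragment = "ATGATGTTTAATGAGATGGGAATGTGTGGAAACATG"
print(translate(fragment))  # MMFNEMGMCGNM
```

The result agrees with the first twelve residues of the EIN3 peptide listing.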
EIL1 cDNA



GGCC6CTTDAA ACTCTACAAACCCAGAAACCACCACACAGTAATTAATGTCT 

CIIIUI I ICI I OCCATGTGATCTTTMCAGACTTTTCTTCTTATTCTCCATCTC 

TGAAGTtGTG GGG ATTCATCAAGACTTCCTTATCTG 1 1 ICII 1 1 ATAAAACAA 

GAG AGAGATACCACTT7TGGTGTTCTTTATTTG CAACTCTTTCAGGTTAAAGA 

AATCGATAGGCTCTGTTCTTGATTGTGGTGGAAGAGAcATGATGATGTTTaAC 

GAGATGGGAATGTATGG AAACATGG A 1 1 1 (J I ICICTtCCTCCACATCTCTCGA 

tGTGtGtccATTACCACAAGCTGAACAAGAACCTGTagtTGAagaTGTCGACTACA 

CCGATGATGAGATGGATGAGCTTGAGCAGAGGATGTGGAGAGACAAAATGC 

GTTTGAAACGTCTCAAGGAGCAACAGAGTAAGTGTAAAGGAGGCGTCGATg 

GTTCGAAACAGAGGCAGTcgCaAGAGCAAGCT AGGAGGAAG AAAAlgtCTAGA 

GCCCAAGATGGGATCTTGAAGTATATGTTGAAGATGAtGGAAGTTTGTAAAG 

CTCMGGCTTTGTTTATGGTATTATTCCTGAGAAGGGTAAGCCTGTGACTGG 

tGCTTCGGATaATTTGAGGGAATGGTgGAAAGATAAGGTTAGGTTTGATCGTA 

ATGGTCCAgCTGCTATTGCTAAGTATCAGtCAGAGAATaATATTTCTGGAGGG 

AGTAATGATTGTAACAGCTTGGTTGGTCCAACACcgcATACGcTTCAGGAGCT 

TCAGGACACGACTCTTGGTTCgCTTTrATCGGC7TrGATGCAACATTGTGAT 

CCACCGCAGAGACGGTrTCCTTTGgaGAAaGGAGTTTCTcCACCTTGGTGGC 

CTAATGGGAATGAAGAgtgGTGGccTcaGCTtgGtTTACCAAATGAGCAAGGTCC 

TCCTCCTTATMGMGCCTCATGATTTGAAGAAAGCTTGGAAAgTCGGTGTTT 

TaACTGCGGTGATCAAGCATATgTCGCCGGATATTGCGAAGATCCGTAAGCT 

TGTGAGGCAATCAAAATGCTTgCAGGATAAGATGACGGCGAAAGAGAGTGC 

TACTTGGCTTGCCATTATTAACCAAGAAGAGGTTGTGGCTCGGGAgCTTTAT 

CCCGAGTCATGCCCTCCTCTTTCTTCTTCTTCATCATTAGGAAGCGGGTCGC 

TtcTCATTAATGATTGTAGCGAGTATGACGTTGaAGGTTTCGAGAAGGaACaA 

CATGGTTTCGATGTGGaAGAGCGGAAACCAGAGATAGTGATGATgCATCCTC 

TAgCAAGCTTTGGGGTTgCTAAAATGCAACATTTTCCCATAAAGGAGGAGGT 

CgCCAcCACGGTAAACTTAGAGTICACGAGAAAGAGGAAGCAGAACAATGAT 

ATGAATGTTATGGTAATGGACAGATCAGcAGGTTACACtTGTGAGaATGGTca 

GTGTCCTCACAGCAAAATGAaTCTTGGATTTCAAGACAGGAGTTCAAGGGAC 

AACCACCAGATgGTTTGTCCATATAGAGACAATCGTTTAGCGTATGGAGCAT 

CCAAGmcATATGGGTGGAAlGAAACTAGTAGTTCCTCAGCAAcCAGTCCaa 

CCGATCGACcTATCGGGCGTTGGAGTTCCGGAAAACGGGCaGAAGATGAT 

CACCGAGCTTATGGCCATGTACGACAGAAATGTCCAAAGCAACCAAACGCC 

TCCTACTTTGATGGAAAACCAAAGCATCGTCATTGAT6CAAAAGCAGCTCAG 

AATCAGCAGCTGMTTTCAACAGTGGCAATCAAATGTTTATGCAACAAGGGA 

S^?i^^ GQOTMCM ^ GTTO ^ AT GGTCmGATTCGACACCATr 

? G ^I^IS^ QCA ^^^^^ AGAT G A ^GCAMCCGGAGCAATGGA 

A S G ^IS G 3S AAGCAGCAGCAGCAGCAecAQ CAGCAGCAaAGATGTA^ 

£E?JT A SH A ^ ACCTGA ^ GG GTATGTATrCTATTGCACCAAACACTCAT 

?3^I A ?]S TrGATGATGATGAAGCCATCT AI » 1 1 1 1 1 1 1 1 G TGTCTGAAAGTC 

A I[^CTCGCTTCATTGTTTT^ 

ATGCTACAAGTTTGATTCTTTGAGGCGGCCGC 
EIL1 peptide

MMMFNEMGMYGNMDFFSSSTSLDVCPLPQAEQEPVVEDVDYTDDEMDVDE
LEKRMWRDKMRLKRLKEQQSKCKEGVDGSKQRQSQEQARRKKMSRAQDGIL
KYMLKMMEVCKAQGFVYGIIPEKGKPVTGASDNLREWWKDKVRFDRNGPAAIA
KYQSENNISGGSNDCNSLVGPTPHTLQELQDTTLGSLLSALMQHCDPPQRRF
PLEKGVSPPWWPNGNEEWWPQLGLPNEQGPPPYKKPHDLKKAWKVGVLTAV
IKHMSPDIAKIRKLVRQSKCLQDKMTAKESATWLAIINQEEVVARELYPESCPPL
SSSSSLGSGSLLINDCSEYDVEGFEKEQHGFDVEERKPEIVMMHPLASFGVA
KMQHFPIKEEVATTVNLEFTRKRKQNNDMNVMVMDRSAGYTCENGQCPHSKM
NLGFQDRSSRDNHQMVCPYRDNRLAYGASKFHMGGMKLVVPQQPVQPIDLS
GVGVPENGQKMITELMAMYDRNVQSNQTPPTLMENQSMVIDAKAAQNQQLNF
NSGNQMFMQQGTNNIGVNNRFQMVFDSTPFDMAAFDYRDDWQTGAMEGMGK
QQQQQQQQQDVSIWF
EIL2 cDNA



^X^T^GOGCX^ATTTACAQAGGGACATATGTGTTCT 
GAGAAGMGATCTGGAGAGACMGCAGCGT7TAMGCGGCTCMGGAAATG 

GAAGTACATGTCGAAGACAATGGAGCGATATAAAGCTCAAGGTTTTCT^ATC 
GGA "J3 G I G J TAGAGAA ^ GGAA ^ CGCT 

TGMTGGTGGAMGACAAAGTGAGGTTTGATAGGMCGGCCCAGCTGCTATA 
ATCMGC^CCAAAGGGATATCMTXJTTTCTGATGGMGTGATTCAGGGTCTGA 
GGTTGGGGATTCTACCGCACAGMGTTGCTTGAGCTTC^ 
^^CJ^^ATCGGCT^ 

TT^ CG n GGAGAAAGGCGTGACAC ^ CCA TGGTGGCCAACGGGGAAAGAAG 
AJTCGTGGGATCAACTGTCnTrAC^^ 

GGTUTAGAAGT7TGCAGGAGAAAATGACGTCAAGAGAAGGCGC 
^I^S?!^? 1 ^ 101 1 1 ACCGAGAAAAGGCTATTGTTGATCAAATAGCCA 

I G ^ AGAGA j^ AACAA ^ 
^CCA^ATGTTTC 

J GGG ^CATCGGACCMTCAGCA^ 

T^I£ AGAAGAG /^^ GAAG MGA^ 

ACE^CATGTGAGMC^^^ 

9» A S^95XIHI A P ^ AACTAAA ^ CTATG GTATGACG GGTTTAATCGTrC 
GA J G 4^^ G ^ GAGGA ^ 

AAAG ^GCTTCAGAG 1 1 1 1 Ul 1 1 1 lA rGTTrTCTAGTCTTTATAGCTTrGTCTC 

TTG^ATTCTCTCATTAAACACAGTTmGATCTCTCCA 

TAGCMTGGAGMGATTAGGTTTCATMTAAG^AA'^^CAMT^^A^^^ 
EIL2 peptide

DSMDMYNNNIGMFRSLVCSSAPPFTEGHMCSDSHTALCDDLSSDEEMEIEEL
EKKIWRDKQRLKRLKEMAKNGLGTRLLLKQQHDDFPEHSSKRTMYKAQDGILK
YMSKTMERYKAQGFVYGIVLENGKTVAGSSDNLREWWKDKVRFDRNGPAAIIK
HQRDINLSDGSDSGSEVGDSTAQKLLELQDTTLGALLSALFPHCNPPQRRFPL
EKGVTPPWWPTGKEDWWDQLSLPVDFRGVPPPYKKPHDLKKLWKIGVLIGVIR
HMASDISNIPNLVRRSRSLQEKMTSREGALWLAALYREKAIVDQIAMSRENNNT
SNFLVPATGGDPD\^PESTDYDVELIGGTHRTNQQYPEFENNYNCVYKRKFE
EDFGMPMHPTLLTCENSLCPYSQPHMGFLDRNLRENHQMTCPYKVTSFYQPT
KPYGMTGLMVPCPDYNGMQQQVQSFQDQFNHPNDLYRPKAPQRGNDDLVED
LNPSPSTLNQNLGLVLPTDFNGGEETVGTENNLHNQGQELPTSWIQ
EIL3 cDNA



TTCCCCTGAGAACGACAGGAGAAAGAATAAAAACCCTAA A'1 1 101 1 IA ATTTC 

GGCGCTTCAGATTATCGTTGTTAAAGG 1 1 1 1 IGATTGA7TTTGTTTAAATGGGC 

GATCTTGCTATGTt^GTAGCAGiACATCAGGATGGAGMTGAGCXrrGATGATr 

TAGCTAGTGATMTGTTCCTGAGATTGATCTGAGTGATGMGAGATTGATGCT 

GACGACCTTGAGAGACXaGATGTGGAAAGATCGTGTCAGGCTTAAAAGAATCA 

AAGAGCGACAAAAAGCTGGCTCTCAAGGAGCTCAAAAC6AAGGGAGACACC 

TMGAAAATUTCTGATCAAGCTCAGAGGMGAAAATGTCTTAGAGCTCAAGAT 

GGTATCCTTMGTACATTGTTGMGCTTATGGAAGTCTGCAMGTTCGCGGGT 

TTGTCTATGGTATAATACCGGAAAAGGGCAAGCCTGTGAGTTGGCTCCTCTG 

ACAATATMGAGCTTGGTGGAAAGAGAAAGTGAAGTTTGATAAGAaCGGTCCT 

GCTCCTATTGCTAAATACGAAGAGGAGTGTTTAGCGTTTGGGAAATCTGATGG 

GAATAGGMTTCACAGTTTGTTCTCCAGGATTTGCAAGATGCTACTTTAGGGT 

CTTT6TTAIUI ICI 1 1 GATGCAACATTGTGATCCTCCTCAAAGGAAGTATCCGT 

TGGAGAAAGGGACGCCTCCGCCTTGGTGGCCAACGGGGAATGAAGAATGGT 

GGGTGAAACTCXaGTCTGCCTAAAAGCCAGAGTCCTCCTT ACCGAAAACCTC 

ATGATCTCMGAAGATGTCGAAGGTTGGAGTTTTAACGGCAGTGATCAATCAT 

ATGTTACCTGATATTGCAAAGATTAAGAGGCATGTrCGTCAGTCGAAATGTn" 

ACAGGACMGATGACAGCTAAAGAGAGTGCGATTTGGTTGGCGGTTTTGAAC 

CAAGAGGAATCTTTGATTCAGCAGCCTAGCAGTGACAATGGAAACTCCAATG 

TGACTGAGACACATCGTAGGGGTAATAACGCTGACAGGAGGAAACCTGTGGT 

CAACAGTGACAGTGACTATGATGTTGATGGGACAGAGGAAGCTTCAGGTTCA 

GTTTCATCTAMGACAGTAGAAGAAATCAGATTCAAAAAGAACAACCAACAG 

CCATCTCACATTCAGTAAGAGATCAAGATAAAGCAGAGAAACATCGCAGAAG 

GAAAAGACCTCGAATTAGATCCGGAACTGTCAATCGACAAGAGGAAGAACAA 

CCTGAAGCTCAACAAAGAAACATCTTACCT6ATATGAATCATGTTGATGCCC 

CTCTGCTAGAATATAACATCAACGGTACTCATCAAGAGGACGATGTTGTCGA 

CCCAMTATTGCCTTAGGACCAGAGGATaATGgTCTGGAACTAGTGGTTCCTG 

AGrFCAATAaCCaaaCATACTTATCTTCCACTTGTTAATGAACAAACTATGATGC 

CTGTAG ACG AaAGGCCAATGCTTTATGG ACCCAAACCCT AACCAAGAGCT 

TCMTTTGGGTCAGGGTACMCTTCTACAATCCCTCTGCAGTGTTTGTACATA 

ACCAGGAAGACGACATTCTCCATACACAGATAGAAATGAATACACAAGCACC 

AC CTCA CAACAGTGGGTTCGAGGAGGCCCCAGGAGGAGTACTTCAACCCCT 

TGGTTTACTCGGAAATGAAGACGGTGTAACAGGGAGTGAGTTGCCTCAGTAT 

CAGAGTGGCATTUrGTCTCCATTGACTOACTTGGACTTTGACTATGGTGGTTT 

TGGTGATGATTTCTrCATCGTTTGGAGCTTAGTGTCTTGCC A 1 1 1 1 1 1 1 IG GGAG 

ATTACAT AGTTCAA AAGGACATGGCMTAGTC TGGCT AGTACAGTTACTTTCT 

CTTCTTCAI I ICI ICIGATCT TATAT T CI ICC1CI I M 1 1 ICI IA TAAT AI 1 1 ICI 

TAGATTTGTTAAGAGAAACAA 1 1 1 ICCI 1 1 1 GAATAAGTTGCCAGAAGAACTGC 

TTTGCCXX3TTGTMTGGTCTCTAGGGAAAGCAGTTAGCX3TATCATCATTTGTA 

AATTTACCTGTGAG 
HLS1 cDNA:



CTCCAACTTTTAAAACTCATCATAAATAGTAAAAAAGTAGCCGGAAAAATAAA 
ATAAAMGTCTATTTlCTCTTrCCTTTAAMTCCA 

TTCTCIGi l(J\ 1 1 ACTTATAGCTCACGTTATACATATATATAG AGTTTCTATA 

AATSCnCICI 1 1 CCTCTCG AACAAATCTTCCTCACXTTCTCTCATTTCCACAC 

TCACXrrreCTgrCTATATATTAAACCCTATCTACTTMCTCTTO 1 ICIAACTCT 

AATCTCTCT CTCTATTTACTCTGCTTCTGTTCTCACTCTGAAAGAACCAAAAC 

ATGACGGTGGTTAGAGAGTACGACXXX3ACCCGAGACTTAGTCGGCGTGGAG 

6ACGT GGAAC GACGGTGTGAAGTCGGACCAAGCGGCAAG C 1 1 ILilLil 1 1 IQ A 

CCGACCTTTTGGGTGACXXIGATTTGTAGMTCCGACATTCACCTTCCTATCT 

C^TGCTGGTGGCTGAGATGGGTACGGAGAAGAAGGAGATAGTGGGCATGATT 

AGA^ATGTATCAAAACCGTTACATGTGGCCAAAAACTCGATTTAAATCACAA 

ATirrCAAAACX^TGTCGTTAAGCX;TCTTrACACTAAACTCGCrrrACGTCTTGG 

GCCTTCGCGTCTCTCCTTTTCACAGGAGACAAGGGATTGGGTTTAAGCTCGT 

GAAGATGATGGAGGAATGGTTTAGACAAAACGGAGCTGAGTATTCGTATATTG 

CMCTGAGMCGATAATCAAGCTTCTGTGMTTTGTTCACCGGGAAATGTGGT 

TATTCGGAGTTTCGTACACCGTCX3ATTTTGGTTAACCCGGTTTACGCTCATCG 

AGTTAATGTTTCGCGGCGAGTCACGGTTATCAAGTTAGAGCCGGTTGATGCT 

GAGACGTTGTACCGAATCCGGTrrAGCACAACAGAGTTTTTCCCGCGGGATA 

TTGATTCGGTACTTAATAACAAACTCTCGCTTGGGACTTTCGTCGCGGTGCCA 

CGTGGAAGCTGTTATGGATCCGGGTCTGGATCATGGCCCGGTTCGGCTAAAT 

TCCTCGAATATCCACCCGAGTCATGGGCCGTATTAAGC6TGTGGAATTGTAA 

AGACTCGTTTCTGTTAGAAGTACGTBGAGCGTCGAGATTGAGACGTGTGGTG 

GCTAAMCGACGTCAGTAGTTGATAAAACXalTGCCGTTTCTGAAACTACCTT 

CGATACCGTCCX37TTTCGMCCTTrTGGACnTCATTTTATGTATGGAATCGGA 

S?A^ A 5^ T ^ A ^^^ GTG ^^ TGGTCA ^ T CCTTGTGTGCTCACGCG 

CATAACTnaGCTAAGGCAGGTGGTTGTGGTGTCGTGGCGGCGGAAGTTGCX; 

S G ^ A ^^ CC ^^ GCGG< ^ GG ^ TA ^ A CATTGGAAAGTGCTATCGTGT 

GA 5 G ^ GATCmGGTGTATAAAGCGGG TTGGAGAT^ 

TGTTGGTGATTGGACTAMTCGCCACCTGGCGTTTCX^ATrnTGTAGACCCT 

^GAATTTTAAAAUI 1 1 1 1 1 1 1 1 AACTCTATMTATATATTCTCTA7TAACCACT 

^LL'L 1 1 1 1 AGGTAACTTTTTTTGC 1 I J I IGI 1 1 1 GTTTTGTTTTGTTTTTGTGG 
oTGTTATAAATTA 
HLS1 genomic sequence:




gccgtttaittttgtgmatgatto^ ^ 
m wHTff a ™ ctt ^^ ca ^ ca ^g^ a ^^g a ^cggctffigagtttaattaactaatt 



a^ectttaggaamgttttmcttaaaatggattc^ 

285582!^^ 

^t^t ^^^^ 9Ca ^ toB39rt ^^^ cal ^ a9taattt 99 ta 9 ata 9 



ct^c^c^^ctcccacttacaacttctccttctgga^cmttccat^eaatgrt^^ a ^ r^tta 



gtj^atotaagag^-raratattto^ 



Jjff^ctaltwcrtcttaacacataaaagtaaaaaaag 



cmrrcTcnnn^ 

I^^I A CGACCCGACCCGAGACrrrAGTCGGCGTGGAGGAC^GAACG 
Jf^^^^^^SSttaatccaaaaaacccatttU^ 



gtaactataactaataaac 
CGTGGAAGCTGTTATGGATCCGGGTCTGGATCATGGCCCGGTTCGGCTAAAT

TCCTCGMTATCCACCCQAGTCATGGGCC^TATTMGCGTGTGGMTrciTAA 

AGACIOal I I AGAAGTACGTCGAGCGTCGAGATTG AGACGTGTGGTG 

GCTAAMCGAt^CG AGTAGTTGATAAAACGTTGCCGTTTCTG AAACT ACCTT 

CGATACCGTCCG 1 1 1 1 (JGAACU 1 1 1 1 GGACTTCATTTTATGTATGGAATCGGA 

GGAGAAGGTCCACGCGCXaGTGAAGATGGTGAAATCCTTGTGTGCTCACGCXa 

CATAACTTGGCTAAGGCAGGTGGTTGTGGTGTCGTGGCGGCGGAAGTTGCC 

GGAGAAGACCCGTTGCGGCGAGGAATACCACATTGGAAAGTGCTATCGTGT 

GAa^GGATCrrrTGGTGTATAAAGCGGCTTGGAGATGACTATAGTGATGGTGT 

TGTTGGTG ATTGGACTAAATCGCCACCTGGCGTTTCCATTTTTGTAG ACCCT A 

GAG AATTTTAAAAG 1 1 1 1 1 1 1 1 1 AACTCTATAATATATATTCTCTATTAACCACTT 

GATGTTAAATTAGGGG 1 1 1 101 1 C TAAGTTTATAG A 1 1 1 IUI 1 G TTTTAGAATTA 

AT lil 1 1 III 1 1A GGTAA UI 1 1 1 1 1 1 GUI 1 1 1 1 GTTTTGTTTTGTTTTG 1 1 1 1 IG T6G 

GTGTTATAMTTAgtggtaagaggtaatati^cctacttttgggWgtgtcttcttgtcttgtaaatggatctagc 

tttttaagatactttttctttgtggccaaaccaaaacgccgaeetgattattatttccaagtagataaaattteatgaac 

gcactgatacgtataatgatgcaatttgtgttaagacgatactttggagataaaattacaatatgacaatgataga 

aaatgttaccaataacgattagcattatcgtgtgtgccatcaagtataactaagagaaagacgcacattttettta 

agagtaaataaaatatt 